# S3-Compatible-Instagram-Scraper
## Overview
This project is an extension of the Instagram scraper built by rarcega.
It is designed to organize the scraped Instagram data neatly in AWS S3, according to the following structure:
```
S3_BUCKET_NAME/
|
|-- instagram/
    |-- TARGET_USER/
        |-- full-metadata.json: Contains metadata for the entire operation
        |-- [POST_ID_X]/
        |   |-- [POST_ID_X].jpg: Image of the post
        |   |-- summary.json: Key information associated with the post
        |-- [POST_ID_Y]/
        |   |-- [POST_ID_Y].jpg
        |   |-- summary.json
        |-- ...
```
- Each post by the target instagram user is stored in its own folder.
- Each folder contains the image as well as the post’s associated metadata.
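
For reference, here is a minimal sketch of how a single post could be written to S3 so that it lands in the key layout above. This is only an illustration using boto3; the `upload_post` function and its parameters are hypothetical, and the actual upload logic lives in `scrape.py`.

```python
import json
import boto3

# Hypothetical sketch of the key layout: instagram/TARGET_USER/POST_ID/...
s3 = boto3.client("s3")

def upload_post(bucket, target_user, post_id, image_bytes, summary_dict):
    prefix = f"instagram/{target_user}/{post_id}"
    # Image of the post: instagram/TARGET_USER/POST_ID/POST_ID.jpg
    s3.put_object(Bucket=bucket, Key=f"{prefix}/{post_id}.jpg", Body=image_bytes)
    # Key information associated with the post: .../summary.json
    s3.put_object(
        Bucket=bucket,
        Key=f"{prefix}/summary.json",
        Body=json.dumps(summary_dict).encode("utf-8"),
    )
```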
## Getting Started
### Prerequisites
These instructions were written for Ubuntu 18.04.
You will need to create a `config.py` file with the following contents:
```python
AWS_ACCESS_KEY_ID = [YOUR AWS_ACCESS_KEY_ID]
AWS_SECRET_ACCESS_KEY = [YOUR AWS_SECRET_ACCESS_KEY]
AWS_REGION_NAME = [YOUR AWS_REGION_NAME]
S3_BUCKET_NAME = [YOUR AWS_S3_BUCKET_NAME]
INSTAGRAM_USER_ID = [YOUR INSTAGRAM_USER_ID]
INSTAGRAM_USER_PASSWORD = [YOUR INSTAGRAM_USER_PASSWORD]
TARGET_INSTAGRAM_USER = [YOUR TARGET_INSTAGRAM_USER TO SCRAPE DATA FROM]
```
A `config_template.py` file has been provided for your convenience.
The variables above are obtained as follows:
- Lines 1-3 are your AWS credentials and region.
- Line 4 is the name of the AWS S3 bucket where the scraped data will be stored.
- Lines 5-7 are self-explanatory: `INSTAGRAM_USER_ID` and `INSTAGRAM_USER_PASSWORD` are your own Instagram credentials, and `TARGET_INSTAGRAM_USER` is the username of the account you intend to scrape data from.
NOTE: Your Instagram user ID and password are required in order to scrape data from private users that you follow.
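
As a rough guide, the AWS-related values in `config.py` are consumed along the lines of the sketch below. The exact wiring in `scrape.py` may differ; this snippet is only an assumption-laden illustration, and the `head_bucket` sanity check is not part of the scraper itself.

```python
import boto3
import config  # the config.py you created above

# Assumed usage of the config values; scrape.py may wire these up differently.
session = boto3.session.Session(
    aws_access_key_id=config.AWS_ACCESS_KEY_ID,
    aws_secret_access_key=config.AWS_SECRET_ACCESS_KEY,
    region_name=config.AWS_REGION_NAME,
)
s3 = session.client("s3")

# Optional sanity check: confirm the bucket is reachable before scraping.
s3.head_bucket(Bucket=config.S3_BUCKET_NAME)
print(f"Ready to scrape @{config.TARGET_INSTAGRAM_USER} into {config.S3_BUCKET_NAME}")
```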
### Installation
- Clone this repository.
```
git clone https://github.com/Jordan396/S3-Compatible-Instagram-Scraper.git
cd S3-Compatible-Instagram-Scraper/
```
- Create a virtual environment and activate it.
```
python3 -m venv venv
source venv/bin/activate
```
- Install dependencies.
```
pip install -r requirements.txt
```
- Add your `config.py` from above to the base directory.
- Start scraping!
```
python scrape.py
```
- Navigate to your S3 bucket to view the scraped data.
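
If you prefer a quick programmatic check instead of the S3 console, something like the following would list the scraped objects for the target user. This snippet is not part of the project; it is a small illustrative helper built on boto3 and the `config.py` values described above.

```python
import boto3
import config

# Hypothetical verification snippet: list everything scraped for the target user.
s3 = boto3.client(
    "s3",
    aws_access_key_id=config.AWS_ACCESS_KEY_ID,
    aws_secret_access_key=config.AWS_SECRET_ACCESS_KEY,
    region_name=config.AWS_REGION_NAME,
)
prefix = f"instagram/{config.TARGET_INSTAGRAM_USER}/"
response = s3.list_objects_v2(Bucket=config.S3_BUCKET_NAME, Prefix=prefix)
for obj in response.get("Contents", []):
    print(obj["Key"])
```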