Week: Mini-Project #2
UPDATE
Click on image above to watch Midjourney AI-generated video using lyrics to "Don't Stop Me Now" by Queen
Overview
This week we bring together the skills honed over the past several weeks in Mini-Project #2. Over the past few months, text-to-image (and now video) generation has taken a dramatic leap forward with the introduction of new large AI models. This also aligns nicely with our own AI research, so we can bring an unusual degree of experience to this rapidly evolving field.
Creating text prompts (prompt engineering) to feed into these text2image models is currently one of the hottest areas of AI research, and we'll explore it further this week. This is an all-hands-on-deck class project that decomposes nicely into smaller groups: text-to-image generation by recently released state-of-the-art large DNN models.
GOAL:
Research, search and scrape Twitter for images and prompts based upon the following 3 state-of-the-art text2image deep neural network (DNN) models:
- DALL-E 2: @openai
- Midjourney: @midjourney
- Stable Diffusion: @stablediffusion
Our goal is to use what we've learned about scraping, APIs, NLP and image processing to find tweets with text-generated images on Twitter. In particular, we want to collect the following:
- the author/organization
- the text prompt and its associated generated image
- any other relevant information, comments or meta-information
Everyone will be assigned to one of three groups, one for each of the three main text2image DNN models listed above. Each group will independently research, scrape and analyze as much data as it can for its assigned DNN model. Then we will combine, compare, and critique our findings as a unified team.
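To make the three groups' results easy to combine, it may help to agree on one record shape for the items listed above. The following is only a sketch: the class name `PromptImageRecord`, its field names, and the example values are hypothetical and not part of any assigned script.

```python
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class PromptImageRecord:
    """One scraped post pairing a text prompt with a generated image."""
    model: str                  # "DALL-E 2", "Midjourney", or "Stable Diffusion"
    author: str                 # handle of the author/organization
    prompt: Optional[str]       # text prompt, if one could be recovered
    image_url: Optional[str]    # URL of the generated image, if any
    source_url: str             # link back to the original post
    notes: dict = field(default_factory=dict)  # comments and other meta-information

    def is_complete(self) -> bool:
        """True only when both the prompt and the image were found."""
        return bool(self.prompt) and bool(self.image_url)


# Hypothetical example values, for illustration only.
record = PromptImageRecord(
    model="Midjourney",
    author="@example_artist",
    prompt="a lighthouse in a storm, oil painting",
    image_url="https://example.com/image.png",
    source_url="https://example.com/post/1",
    notes={"likes": 12},
)
print(record.is_complete())
print(asdict(record))
```

Keeping `prompt` and `image_url` optional lets a group store partial finds and filter them later with `is_complete()`.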
Readings
- [Monday]:
- The absolute beginners guide to MidJourney AI. Starting with AI Art (start at 20:00 and peruse for 5-10 minutes to get a sense of how to interactively design prompts for image generation)
- Research all 3 models (start with the links listed under GOAL above), including:
- Websites (official and tutorials)
- Reddit subthreads (r/subreddits)
- Twitter (official and tutorials)
- Twitter (artists and programmers)
- The Mind-Blowing DALL·E 2 Prompt Book: expand and read the description immediately below the YouTube video for links to various text2image social media users and groups.
- DALL-E Prompt Book
- Review Twitter API (version 2) from previous weeks
- [Wednesday]:
- Design Guidelines for Prompt Engineering Text-to-Image Generative Models (7:33) Brief overview of ACM paper on prompt engineering experiments
- (Gallery) Lexica.art: scroll through, search and roll over to view the text prompt that generated each image
- (Gallery) PromptHero.com
- (Gallery) NightCafe.Studio
- [Friday]:
- Lexica.art: API Instructions
- Freecodecamp.com: Web Scraping with Python, Tweepy and Snscrape
- Github Action to Scrape #depression Tweets Daily
- How to Scrape Twitter with Snscrape with Repo
- scrape_twitter_7day.py - For normal dev accounts that can only scrape the last 7 days (update the start/end date variables to match a current trailing 7-day window)
- scrape_twitter_freecc.py - For academic research dev accounts that can scrape beyond 7 days
- scrape_reddit_sentiment.py - Simple Reddit scraper
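The trailing 7-day window mentioned for the normal dev account script can be computed rather than typed in by hand. This is a minimal sketch, not taken from the scripts themselves: the function name `trailing_window` and the one-minute safety margin are assumptions, and the timestamp format assumes the API expects UTC ISO 8601 strings such as `2022-10-03T12:01:00Z`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def trailing_window(
    days: int = 7,
    now: Optional[datetime] = None,
    margin: timedelta = timedelta(minutes=1),
) -> Tuple[str, str]:
    """Return (start, end) UTC timestamps for the trailing `days`-day window.

    `now` should be timezone-aware; it defaults to the current UTC time.
    The margin keeps both ends strictly inside the window, since a request
    sent a few seconds later could otherwise fall just outside the limit.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    start = now - timedelta(days=days) + margin
    end = now - margin
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return start.strftime(fmt), end.strftime(fmt)


start_time, end_time = trailing_window()
print(start_time, end_time)
```

The two strings can then be assigned to whatever start/end date variables the script you are running uses.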
Assignments:
Teams:
Social Media | DALL-E 2 | Midjourney | Stable Diffusion |
---|---|---|---|
Freddie | Jill | Jeremy | |
Ani | Claire | Vikas | |
Viet | Max | Devon | |
Teddy | Anav | Abbie | |
Tao |
Identify Scrape Targets:
- Official Social Media Accounts (e.g. @handles)
- Groups/Boards of Fans (e.g. subreddits)
- Search Terms (e.g. regular or #hashtags)
- Individual Artists
- Galleries (e.g. Lexica.art)
- Other sources (e.g. Slack, Discord)
Guidance:
- Keep good notes as you go
- Give the FULL DETAILS needed to scrape BOTH (a) generated images and (b) text prompts (plus any explanations or context given):
  - Full URL paths/subdirectories
  - Unique usernames/handles/hashtags
  - Distinct and effective keywords/search terms
  - etc.
- Try to group common patterns into a taxonomy as you go, and note which distinctive features you are basing your classification on
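Once handles, hashtags and keywords have been noted, they can be combined into a single search query. The sketch below assumes the Twitter API v2 search operators `from:`, `has:images` and `-is:retweet`; the function name `build_query` and its parameters are hypothetical, and which operators you can use may depend on your access level.

```python
from typing import Iterable


def build_query(
    handles: Iterable[str],
    hashtags: Iterable[str],
    keywords: Iterable[str],
    images_only: bool = True,
    skip_retweets: bool = True,
) -> str:
    """Combine scrape targets into one OR-grouped search query string."""
    terms = [f"from:{h.lstrip('@')}" for h in handles]
    terms += [f"#{t.lstrip('#')}" for t in hashtags]
    # Multi-word keywords are quoted so they match as an exact phrase.
    terms += [f'"{k}"' if " " in k else k for k in keywords]
    if not terms:
        raise ValueError("at least one handle, hashtag or keyword is required")

    query = "(" + " OR ".join(terms) + ")"
    if images_only:
        query += " has:images"      # keep only tweets with an attached image
    if skip_retweets:
        query += " -is:retweet"     # drop retweets to reduce duplicates
    return query


print(build_query(["@midjourney"], ["#midjourney"], ["midjourney prompt"]))
```

Recording the exact query string next to each batch of results also makes it easier to describe in your notes which targets produced which data.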
References
- unicodedata library
- tweets_df['Tweet'] = tweets_df['Tweet'].apply(lambda x: unicodedata.normalize('NFD', x).encode('ascii', 'ignore').decode('ascii'))
- Git - The Simple Guide
- Promptbase: A marketplace for text engineering
- Awesome Prompt Papers
- Python os.environ() vs python-dotenv
- Twitter Hashtag Search: keywordtool
- Twitter Hashtag Search: tagsfinder
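The `unicodedata` normalization listed above can also be written as a small reusable function. This is a sketch with a hypothetical helper name, `to_ascii`; it decodes the result back to `str`, since `encode` on its own returns `bytes`.

```python
import unicodedata


def to_ascii(text: str) -> str:
    """Strip accents and drop any characters that have no ASCII form."""
    # NFD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Encoding with errors="ignore" drops the marks and other non-ASCII characters.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii("café déjà vu"))  # prints: cafe deja vu
```

With a pandas DataFrame such as `tweets_df`, the helper could be applied as `tweets_df['Tweet'].apply(to_ascii)`. Note that characters with no ASCII equivalent, such as emoji, are removed entirely.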
Twitter API Ver 2
- Twitter Developer Platform Resources
- An Extensive Guide to Collecting Tweets from Twitter API v2 for Academic Research Using Python 3 (Academic 10M/mo access level)
- A Comprehensive Guide for Using the Twitter API v2 Using Tweepy
- Twitter API Ver 2: Reference
- Twitter API Ver 2: Playground
- Twitter API Ver 2: Query Builder
- Twitter API Ver 2 Annotations: Entities.Context
- Twitter API Ver 2 Data Dictionary
- Twitter API Ver 2: Examples
- Twitter API Ver 2: Sample Code
- Twitter API Ver 2: Notification via Integration with AWS/Twilio (Java)
- Sentiment Analysis of Live Tweets (12:00) 3/3
Discord
- Awesome Discord Communities
- Discord Official API
- Discord API Python Wrapper
- Discord RedBot
- PyDiscord Bot
- DiscordChatExporter (Win GUI/CLI)
Slack
Other Scrapers & Bots
- Awesome Bots (8/21)
- Rasa Multichat Bot ML Automation
- Mattermost Multichat Bridge
- disease-sh API and c19 scraper (JS)
- Wayback Machine Scraper (CL)