A powerful, easy-to-use Python tool for extracting high-engagement tweets from Twitter/X with advanced filtering capabilities
Features β’ Installation β’ Quick Start β’ Documentation β’ Examples
Twitter Advanced Scraper is a robust Python library that enables you to extract tweets from Twitter/X based on search queries or user profiles. Filter results by engagement metrics (likes, comments, views) to find the most impactful content. Perfect for social media analysis, market research, trend monitoring, and content curation.
- β Engagement Filtering - Find tweets with minimum likes, comments, and views
- β Dual Search Modes - Search by keywords or specific user profiles
- β Rich Data Export - Get complete tweet metadata in JSON format
- β Smart Pagination - Automatically fetch multiple pages of results
- β Cookie Rotation - Built-in cookie management for reliability
- β Easy Integration - Simple API that works out of the box
| Feature | Description |
|---|---|
| Search Timeline | Search Twitter for any keyword or hashtag |
| Profile Scraping | Extract tweets from specific user profiles |
| Engagement Filters | Set minimum thresholds for likes, comments, and views |
| Flexible Sorting | Choose between "Top", "Latest", or "People" results |
| Auto-Pagination | Fetch multiple pages automatically |
| JSON Export | Save results in clean, structured JSON format |
| Error Handling | Built-in retry logic and error management |
- Python 3.7 or higher
- pip package manager
git clone https://github.com/yourusername/twitter-advanced-scraper.git
cd twitter-advanced-scraperpip install requests mahdixAdd your Twitter cookies to the cookies_list in main.py:
cookies_list = [
'your_cookie_string_here'
]π‘ Tip: You can add multiple cookies for rotation and reliability.
from main import main
# Search for tweets about "Python programming" with high engagement
results = main(
search_topi=['Python programming'],
count=20,
reaction_count=100, # Minimum likes
views_count=1000, # Minimum views
comment_count=10, # Minimum comments
mathods='TwitterSearchTimeline',
number_lop=5
)
print(results)from main import main
# Get tweets from Elon Musk's profile
results = main(
search_topi=['elonmusk'],
count=15,
reaction_count=500,
views_count=10000,
comment_count=50,
mathods='profile_search',
number_lop=3
)
print(results)main(search_topi, count, reaction_count, views_count, comment_count, mathods, number_lop)| Parameter | Type | Description | Example |
|---|---|---|---|
search_topi |
list | Search queries or usernames | ['AI', 'ChatGPT'] or ['elonmusk'] |
count |
int | Maximum tweets to retrieve | 50 |
reaction_count |
int | Minimum likes required | 100 |
views_count |
int | Minimum views required | 1000 |
comment_count |
int | Minimum comments required | 10 |
mathods |
str | Search method: 'TwitterSearchTimeline' or 'profile_search' |
'TwitterSearchTimeline' |
number_lop |
int | Number of pagination loops | 5 |
Search Twitter by keywords, hashtags, or phrases.
When to use: Finding trending topics, tracking hashtags, discovering viral content
results = main(
search_topi=['#AI', 'machine learning'],
count=30,
reaction_count=200,
views_count=5000,
comment_count=20,
mathods='TwitterSearchTimeline'
)Available Products:
'Top'- Most relevant and engaging tweets'Latest'- Most recent tweets'People'- Tweets from people you might follow
Extract tweets from specific user profiles.
When to use: Analyzing influencer content, monitoring brand accounts, tracking competitors
results = main(
search_topi=['jack', 'naval'],
count=25,
reaction_count=300,
views_count=10000,
comment_count=30,
mathods='profile_search'
)# Find highly engaged cryptocurrency tweets
crypto_tweets = main(
search_topi=['#Bitcoin', '#Ethereum', '#Crypto'],
count=100,
reaction_count=1000,
views_count=50000,
comment_count=100,
mathods='TwitterSearchTimeline',
number_lop=10
)
# Save to file
import json
with open('crypto_trends.json', 'w', encoding='utf-8') as f:
json.dump(crypto_tweets, f, ensure_ascii=False, indent=4)# Compare content from tech leaders
influencers = main(
search_topi=['sama', 'pmarca', 'naval', 'elonmusk'],
count=50,
reaction_count=500,
views_count=100000,
comment_count=50,
mathods='profile_search',
number_lop=5
)
# Process results
for user_data in influencers:
username = user_data['keyword']
tweet_count = len(user_data['tweets'])
print(f"{username}: {tweet_count} high-engagement tweets found")# Track news as it happens
news_tweets = main(
search_topi=['breaking news', 'latest update'],
count=30,
reaction_count=50,
views_count=10000,
comment_count=5,
mathods='TwitterSearchTimeline',
number_lop=3
)
# Filter by recency
from datetime import datetime
recent_tweets = [
tweet for user in news_tweets
for tweet in user['tweets']
if 'Jan 2026' in tweet['date_of_post']
]The scraper returns data in this structure:
[
{
"type": "profile",
"keyword": "elonmusk",
"tweets": [
{
"post_id": "1234567890123456789",
"post_url": "https://x.com/elonmusk/status/1234567890123456789",
"date_of_post": "Mon Jan 20 15:30:00 +0000 2026",
"tweet": "Full tweet text here...",
"likes": 15000,
"comments": 850,
"views": "2500000"
}
]
}
]| Field | Description |
|---|---|
type |
Type of search performed ("profile" or "search") |
keyword |
Search query or username used |
post_id |
Unique Twitter post ID |
post_url |
Direct URL to the tweet |
date_of_post |
Timestamp when tweet was posted |
tweet |
Full text content of the tweet |
likes |
Number of likes (favorites) |
comments |
Number of replies |
views |
Number of views (impressions) |
- Identify viral content in your niche
- Track competitor engagement strategies
- Find influencers for partnerships
- Analyze consumer sentiment
- Track brand mentions and perception
- Monitor industry trends and discussions
- Discover trending topics for your audience
- Find high-quality content to share
- Build content calendars based on engagement data
- Study social media patterns and behaviors
- Analyze public discourse on topics
- Collect data for sentiment analysis
- Monitor breaking stories in real-time
- Track public reactions to events
- Find eyewitness accounts and sources
Modify request headers in the TwitterSearchTimeline class:
self.headers = {
'authorization': 'Bearer YOUR_BEARER_TOKEN',
'x-csrf-token': 'YOUR_CSRF_TOKEN',
# ... other headers
}Add multiple cookies for better reliability:
cookies_list = [
'cookie_1',
'cookie_2',
'cookie_3'
]The scraper automatically rotates between cookies to distribute requests.
Adjust the number of loops for deeper searches:
# Shallow search (faster)
results = main(..., number_lop=2)
# Deep search (more results)
results = main(..., number_lop=15)Issue: "Rate limit exceeded"
- Solution: Add more cookies or reduce
number_lopparameter
Issue: "No results found"
- Solution: Lower your engagement thresholds (reaction_count, views_count, comment_count)
Issue: "Invalid cookie"
- Solution: Update your cookies from a fresh Twitter session
Issue: "Connection timeout"
- Solution: Check your internet connection or add proxy support
- Log into Twitter/X in your browser
- Open Developer Tools (F12)
- Go to Application β Cookies β https://x.com
- Copy the entire cookie string
- Paste into
cookies_listin main.py
Contributions are welcome! Here's how you can help:
- Fork the repository
- Create a feature branch (
git checkout -b feature/AmazingFeature) - Commit your changes (
git commit -m 'Add some AmazingFeature') - Push to the branch (
git push origin feature/AmazingFeature) - Open a Pull Request
- Add support for more search filters
- Implement proxy rotation
- Add data visualization features
- Improve error handling
- Add unit tests
- Create a web interface
This project is licensed under the MIT License - see the LICENSE file for details.
This tool is for educational and research purposes only. Please ensure you comply with Twitter's Terms of Service and API usage policies. The developers are not responsible for any misuse of this software.
Important Notes:
- Respect rate limits to avoid account restrictions
- Use cookies from your own accounts only
- Don't use for spam or harassment
- Be mindful of privacy and data protection laws
Have a project in mind or need expert help?
I'm available for freelance work and paid collaborations.
- π§ Email: shuvobbhh@gmail.com
- π¬ Telegram: +8801616397082
- π± WhatsApp: +8801616397082
- π Portfolio: mahdi-hasan-shuvo.github.io
- π» GitHub: @Mahdi-hasan-shuvo
"Quality work speaks louder than words. Let's build something remarkable together."
Built with β€οΈ for seamless file synchronization