Comprehensive hands-on tutorials for using and extending the Multi-Modal Academic Research System.
If you're new to the system, we recommend following the tutorials in this order:
- Collecting Papers - Learn how to collect academic content
- Custom Searches - Master advanced search techniques
- Exporting Citations - Manage and export your research citations
- Visualization Dashboard - Analyze your collection with visualizations
- Extending the System - Customize and extend functionality
Tutorial: Collecting Papers | File: collect-papers.md | Level: Beginner | Time: 30-45 minutes
Learn how to collect academic papers from multiple sources including ArXiv, Semantic Scholar, and PubMed Central.
Topics Covered:
- Using the Gradio UI for data collection
- Python API for programmatic collection
- Different search strategies for each source
- Batch collection and automation
- Troubleshooting common issues
- Best practices for paper collection
What You'll Build:
- A collection of 100+ papers on your research topic
- Automated collection scripts
- Deduplication and filtering workflows
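As a rough illustration of the deduplication workflow mentioned above, the sketch below drops repeated records that arrive from different sources. It is not the system's own implementation: the record layout (dictionaries with `title` and `source` keys) and the helper names `normalize_title` and `deduplicate` are assumptions made for this example.

```python
import re
from typing import Dict, Iterable, List


def normalize_title(title: str) -> str:
    """Lowercase a title and collapse punctuation and whitespace into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(papers: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first record seen for each normalized title.

    Records without a usable title are skipped (an assumption of this sketch).
    """
    seen = set()
    unique = []
    for paper in papers:
        key = normalize_title(paper.get("title", ""))
        if key and key not in seen:
            seen.add(key)
            unique.append(paper)
    return unique


# Placeholder records; the titles and source names are illustrative only.
papers = [
    {"title": "An Example Paper on Retrieval", "source": "arxiv"},
    {"title": "An example paper on retrieval.", "source": "semantic_scholar"},
    {"title": "A Second Example Paper", "source": "pubmed_central"},
]
unique_papers = deduplicate(papers)
print(len(unique_papers), "unique papers")
```

Matching on normalized titles is a deliberately simple choice; a real workflow might prefer a stable identifier such as a DOI when every source provides one.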
Tutorial: Custom Searches | File: custom-searches.md | Level: Intermediate | Time: 45-60 minutes
Master advanced search techniques using Boolean operators, field boosting, filters, and OpenSearch Query DSL.
Topics Covered:
- Advanced query syntax (AND, OR, NOT, wildcards)
- Field-specific searches and boosting
- Combining multiple filters
- Date, author, and category filtering
- OpenSearch Query DSL
- Search relevance optimization
- Custom re-ranking algorithms
What You'll Build:
- Custom search queries for precise results
- Multi-criteria filtering scripts
- Relevance tuning configurations
- Similarity search functionality
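To give a feel for how field boosting and filters combine in OpenSearch Query DSL, here is a minimal sketch that builds a query body as a plain Python dictionary. The field names (`title`, `abstract`, `category`, `published_date`) and the helper `build_query` are assumptions for illustration; the tutorial's actual index mapping may differ.

```python
import json
from typing import Any, Dict, Optional


def build_query(
    text: str,
    category: Optional[str] = None,
    published_after: Optional[str] = None,
    exclude: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a bool query: boosted full-text match plus optional filters and exclusions."""
    # "title^3" weights title matches three times as heavily as abstract matches.
    must = [{"multi_match": {"query": text, "fields": ["title^3", "abstract"]}}]

    # Filter clauses restrict results without affecting the relevance score.
    filters = []
    if category:
        filters.append({"term": {"category": category}})
    if published_after:
        filters.append({"range": {"published_date": {"gte": published_after}}})

    # must_not plays the role of a Boolean NOT.
    must_not = [{"match": {"title": exclude}}] if exclude else []

    return {"query": {"bool": {"must": must, "filter": filters, "must_not": must_not}}}


query = build_query(
    "graph neural networks",
    category="cs.LG",
    published_after="2023-01-01",
    exclude="survey",
)
print(json.dumps(query, indent=2))
```

The resulting dictionary can be passed as the request body of a search call. Building it separately from the client keeps the query easy to inspect and test.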
Tutorial: Exporting Citations | File: export-citations.md | Level: Beginner to Intermediate | Time: 30-40 minutes
Learn how to export and manage citations in various formats for your research writing.
Topics Covered:
- Understanding citation tracking
- Exporting from the Gradio UI
- Programmatic export via Python
- Multiple citation formats (BibTeX, APA, MLA, Chicago)
- Integrating with Zotero, Mendeley, and EndNote
- Creating custom citation formats
- Automated export workflows
What You'll Build:
- Bibliography files in multiple formats
- Integration with reference managers
- Custom citation formatters
- Automated export scripts
- Citation usage reports
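A custom citation formatter can be quite small. The sketch below turns a paper record into a BibTeX entry; it assumes a record with `title`, `authors`, `year`, and an optional `url`, and the function names `bibtex_key` and `to_bibtex` are hypothetical rather than part of the system's API.

```python
from typing import Any, Dict


def bibtex_key(paper: Dict[str, Any]) -> str:
    """Build a citation key from the first author's last name and the year."""
    last_name = paper["authors"][0].split()[-1].lower()
    return f"{last_name}{paper['year']}"


def to_bibtex(paper: Dict[str, Any], entry_type: str = "article") -> str:
    """Format one paper record as a BibTeX entry string."""
    fields = {
        "title": paper["title"],
        # BibTeX separates multiple authors with the word "and".
        "author": " and ".join(paper["authors"]),
        "year": str(paper["year"]),
    }
    if paper.get("url"):
        fields["url"] = paper["url"]
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@{entry_type}{{{bibtex_key(paper)},\n{body}\n}}"


# Placeholder record for illustration only.
paper = {
    "title": "An Example Paper on Retrieval",
    "authors": ["Jane Doe", "John Smith"],
    "year": 2024,
}
entry = to_bibtex(paper)
print(entry)
```

The entry type defaults to `article` here purely for simplicity; preprints are often exported as `misc` instead, which the `entry_type` parameter allows.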
Tutorial: Visualization Dashboard | File: visualization.md | Level: Intermediate | Time: 40-50 minutes
Explore and analyze your research collection using the FastAPI visualization dashboard.
Topics Covered:
- Starting the FastAPI dashboard
- Understanding collection statistics
- Filtering and searching data
- Exporting data in various formats
- Creating custom visualizations
- Using the REST API
- Real-time monitoring
What You'll Build:
- Interactive charts and graphs
- Custom analytics dashboards
- Data export pipelines
- API client for automation
- Collection monitoring tools
Tutorial: Extending the System | File: extending.md | Level: Advanced | Time: 60-90 minutes
Learn how to extend the system with new data collectors, processors, UI components, and search filters.
Topics Covered:
- System architecture overview
- Creating new data collectors
- Building custom processors
- Modifying the Gradio UI
- Adding search filters
- Writing tests
- Contributing to the project
What You'll Build:
- Custom blog post collector
- GitHub repository collector
- Custom data processors
- New UI components
- Advanced search filters
- Test suites
Learning path if your focus is using the system for your research:
- Collecting Papers - Build your research database
- Custom Searches - Find relevant papers efficiently
- Exporting Citations - Manage citations for writing
- Visualization Dashboard - Analyze your collection
Learning path if your focus is extending and customizing the system:
- Collecting Papers - Understand data flow
- Custom Searches - Learn search architecture
- Extending the System - Add new features
- Visualization Dashboard - Build analytics tools
Learning path if your focus is data analysis and visualization:
- Collecting Papers - Gather data
- Visualization Dashboard - Analyze patterns
- Custom Searches - Extract insights
- Extending the System - Add custom analytics
Before starting the tutorials, ensure you have:
- Python 3.8+ installed
- Virtual environment activated
- Dependencies installed (pip install -r requirements.txt)
- OpenSearch running (via Docker)
- Gemini API key configured in .env
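The prerequisites above can be checked with a short script before starting a tutorial. This is only a sketch under stated assumptions: the helper names are hypothetical, the `.env` file is assumed to use plain `NAME=value` lines, and OpenSearch is assumed to answer over plain HTTP at `http://localhost:9200`.

```python
import sys
import urllib.error
import urllib.request


def python_ok(minimum=(3, 8)) -> bool:
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


def env_has_key(env_text: str, name: str = "GEMINI_API_KEY") -> bool:
    """Return True if the .env text assigns a non-empty value to `name`."""
    for line in env_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip comments and lines that are not assignments
        key, _, value = line.partition("=")
        if key.strip() == name and value.strip():
            return True
    return False


def opensearch_reachable(url: str = "http://localhost:9200", timeout: float = 2.0) -> bool:
    """Return True if something answers HTTP requests at `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server responded, even if with an error status
    except (urllib.error.URLError, OSError):
        return False  # nothing listening, or the connection failed


# Demo on an in-memory sample rather than a real .env file.
sample_env = "# local settings\nGEMINI_API_KEY=example-key\n"
print("Python version OK:", python_ok())
print("Gemini key present:", env_has_key(sample_env))
```

In practice you would pass the contents of your own `.env` file to `env_has_key` and call `opensearch_reachable()` once the container is up. Note that the key check only confirms a value is present, not that it is valid.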
Quick setup:
# Clone repository
git clone https://github.com/your-repo/multi-modal-academic-research-system.git
cd multi-modal-academic-research-system
# Create virtual environment
python -m venv venv
source venv/bin/activate # Mac/Linux
# On Windows, run this instead of the line above:
# venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Configure environment
cp .env.example .env
# Edit .env and add your GEMINI_API_KEY
# Start OpenSearch
docker run -p 9200:9200 -e "discovery.type=single-node" opensearchproject/opensearch:latest
# Start application
python main.py
Each tutorial includes:
- Step-by-step instructions with clear explanations
- Complete code examples that you can copy and run
- Practical use cases based on real research scenarios
- Troubleshooting sections for common issues
- Best practices and tips
- Common pitfalls to avoid
- Next steps and related tutorials
All code examples from the tutorials are available in the examples/ directory:
- examples/collection/ - Paper collection scripts
- examples/search/ - Advanced search examples
- examples/citations/ - Citation export scripts
- examples/visualization/ - Visualization examples
- examples/extensions/ - Custom extensions
- GitHub Issues: Report bugs or request features
- Discussions: Ask questions and share ideas
- Pull Requests: Contribute improvements
If you encounter issues while following the tutorials:
- Check the troubleshooting section in each tutorial
- Review the logs in the logs/ directory
- Consult the main documentation in docs/
- Search GitHub issues for similar problems
- Ask for help in GitHub Discussions
We welcome improvements to these tutorials! If you:
- Find an error or typo
- Have a suggestion for clarification
- Want to add a new example
- Discover a better approach
Please submit a pull request or open an issue.
If you'd like to contribute a new tutorial:
- Follow the existing tutorial structure
- Include practical, working code examples
- Test all code examples before submitting
- Add troubleshooting tips based on real issues
- Include screenshots or diagrams where helpful
- Link to related tutorials and documentation
We'd love to hear your feedback on these tutorials:
- What worked well?
- What was confusing?
- What's missing?
- What would you like to learn more about?
Please share your thoughts via GitHub issues or discussions.
Last Updated: October 2024 | Version: 1.0.0 | Maintained by: Multi-Modal Research System Team