Getting Started
This guide walks you through setting up the Scientific Literature Explorer on your machine.
Prerequisites
Before installing, ensure you have:
- Python ≥ 3.10 (3.11+ recommended)
- A ScaleDown API key — Get yours at ScaleDown Getting Started
- A Google Gemini API key — Free tier available at Google AI Studio
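The Python requirement above can be checked before installing anything. The following is a minimal sketch that uses only the standard library; the function name `meets_minimum` is illustrative and is not part of the project.

```python
import sys


def meets_minimum(version_info=None, minimum=(3, 10)):
    """Return True if the interpreter version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); the patch level does not matter here.
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    found = sys.version.split()[0]
    if meets_minimum():
        print("Python version OK:", found)
    else:
        print("Python 3.10 or newer is required, found", found)
```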
Installation
1. Clone the Repository
2. Create a Virtual Environment
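One way to create the environment is through Python's standard `venv` module, which is what `python -m venv` runs. The sketch below assumes the environment lives in a folder named `.venv`; that name and the helper `create_environment` are assumptions for illustration, not something this guide prescribes.

```python
import venv
from pathlib import Path


def create_environment(env_dir=".venv", with_pip=True):
    """Create a virtual environment, similar to `python -m venv <env_dir>`."""
    env_path = Path(env_dir)
    # with_pip=True bootstraps pip inside the new environment.
    venv.create(env_path, with_pip=with_pip)
    return env_path
```

After calling `create_environment()`, activate the environment with `source .venv/bin/activate` on Linux or macOS, or `.venv\Scripts\activate` on Windows.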
3. Install Dependencies
Dependencies include:
- requests — HTTP client for ScaleDown, Gemini, and ArXiv APIs
- python-dotenv — Load .env configuration
- numpy — Array operations for TF-IDF
- scikit-learn — TF-IDF vectorizer and cosine similarity
- PyPDF2 — PDF text extraction from ArXiv papers
- rich — Terminal UI (tables, panels, markdown, progress spinners)
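To confirm that the packages listed above are importable in the active environment, a small check along these lines may help. It is a sketch, not part of the project; note that `python-dotenv` and `scikit-learn` are imported as `dotenv` and `sklearn`.

```python
from importlib.util import find_spec

# Map pip package names to the module names used in import statements.
IMPORT_NAMES = {
    "requests": "requests",
    "python-dotenv": "dotenv",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "PyPDF2": "PyPDF2",
    "rich": "rich",
}


def missing_packages(import_names=IMPORT_NAMES):
    """Return the pip package names whose modules cannot be found."""
    return [pkg for pkg, module in import_names.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```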
Configuration
1. Create .env File
Copy the example environment file:
2. Add Your API Keys
Open .env in your favorite editor and fill in your keys:
# Required
SCALEDOWN_API_KEY=your_scaledown_api_key_here
GEMINI_API_KEY=your_gemini_api_key_here
# Optional Configuration Overrides
SCALEDOWN_MODEL=gemini-2.5-flash # Target model for compression optimization
GEMINI_MODEL=gemini-2.5-flash # Gemini model to use
CHUNK_SIZE=1000 # Characters per chunk
CHUNK_OVERLAP=200 # Overlap between chunks
TOP_K=5 # Number of chunks to retrieve
SCALEDOWN_TIMEOUT=15 # Timeout in seconds for ScaleDown API
See Configuration Reference for detailed explanations of each variable.
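As a rough illustration of how these variables might be read and how `CHUNK_SIZE` and `CHUNK_OVERLAP` interact, consider the sketch below. `Settings`, `load_settings`, and `chunk_text` are hypothetical names, the fallback values are assumed to match the example values above, and the project's actual loader and chunking logic may differ.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass(frozen=True)
class Settings:
    scaledown_api_key: str
    gemini_api_key: str
    scaledown_model: str
    gemini_model: str
    chunk_size: int
    chunk_overlap: int
    top_k: int
    scaledown_timeout: float


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, failing early on bad input."""
    env = os.environ if env is None else env
    missing = [k for k in ("SCALEDOWN_API_KEY", "GEMINI_API_KEY") if not env.get(k)]
    if missing:
        raise ValueError("Missing required keys: " + ", ".join(missing))
    settings = Settings(
        scaledown_api_key=env["SCALEDOWN_API_KEY"],
        gemini_api_key=env["GEMINI_API_KEY"],
        # Fallbacks mirror the example values above (assumed to be the defaults).
        scaledown_model=env.get("SCALEDOWN_MODEL", "gemini-2.5-flash"),
        gemini_model=env.get("GEMINI_MODEL", "gemini-2.5-flash"),
        chunk_size=int(env.get("CHUNK_SIZE", "1000")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "200")),
        top_k=int(env.get("TOP_K", "5")),
        scaledown_timeout=float(env.get("SCALEDOWN_TIMEOUT", "15")),
    )
    if not 0 <= settings.chunk_overlap < settings.chunk_size:
        raise ValueError("CHUNK_OVERLAP must be >= 0 and smaller than CHUNK_SIZE")
    return settings


def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into windows of chunk_size characters sharing chunk_overlap characters."""
    step = chunk_size - chunk_overlap  # each window starts this far after the previous one
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

With `CHUNK_SIZE=1000` and `CHUNK_OVERLAP=200`, each window starts 800 characters after the previous one, so a 2,500-character text yields four chunks starting at offsets 0, 800, 1600, and 2400. To read values from the `.env` file, call `load_dotenv()` from `python-dotenv` before `load_settings()`.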
Directory Structure
The system will automatically create these folders on first use:
RAG/
├── papers/ # Downloaded PDFs and extracted text
├── artifacts/ # Stored reasoning outputs (chain-of-thought, self-verification, self-critique)
│ ├── cot/
│ ├── self_verify/
│ └── self_critique/
└── sessions/ # Conversation history (JSON per session)
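The project creates these folders itself, so nothing needs to be done by hand. For illustration only, the same layout could be produced with `pathlib` as sketched here; `ensure_data_dirs` is a hypothetical helper, not the project's own code.

```python
from pathlib import Path

# Sub-folders from the tree above, relative to the RAG/ root.
SUBDIRS = (
    "papers",
    "artifacts/cot",
    "artifacts/self_verify",
    "artifacts/self_critique",
    "sessions",
)


def ensure_data_dirs(root="RAG"):
    """Create the folder tree under `root` if it is missing and return the paths."""
    root_path = Path(root)
    paths = []
    for sub in SUBDIRS:
        path = root_path / sub
        # exist_ok=True makes repeated calls safe.
        path.mkdir(parents=True, exist_ok=True)
        paths.append(path)
    return paths
```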
Verify Installation
Test that everything is working:
If you see a response, you're ready to go! 🎉
Next: Usage Examples
See Usage Guide for all available commands and examples.