Quick Start – Toledo1

Provider API Links

OpenAI: https://platform.openai.com/api-keys
Anthropic: https://console.anthropic.com/settings/keys
xAI: https://x.ai/api
Perplexity: https://www.perplexity.ai/settings/api
OpenRouter: https://openrouter.ai/settings/keys
ollama (Run AI models on device): https://ollama.com

Quick Setup Instructions

Initial Setup:
- Install Toledo1 and launch the application
- Navigate to Settings > Presets to view available models
- Preset 9 (Llama-70b) is enabled by default – free to use
Configure API Keys:
- Click provider links above to get your API keys
- Copy your API key from provider website
- Paste key into corresponding preset’s API Key field
- Save changes to activate the preset
Run On Device Inference On Preset5:
- Run AI models on device with this tutorial
Select & Enable Models:
- Choose desired preset based on your needs (see descriptions below)
- Append :online to model name if on a OpenRouter preset for web search results
- Click the Enable button for your selected preset
- Return to ChatLog tab to begin using the model
Start Using Toledo1:
- Type your queries in the chat input
- Right-click and select ‘Clear’ or type :clear to start a new chat
- Switch between presets anytime in Settings

Note: Each preset is optimized for different tasks. Experiment with different models to find the best fit for your needs.

Preset 1

URL: https://api.openai.com/v1
Model: gpt-4o
Note: Agentic model, high reasoning, great at coding
System: You are a helpful search assistant.
Max Context Tokens: 128,000
Temperature: 0.8

Preset 2

URL: https://api.openai.com/v1
Model: gpt-4o-mini
Note: Low cost, great at coding tasks
System: You are a helpful search assistant.
Max Context Tokens: 128,000
Temperature: 0.8

Preset 3

URL: https://api.anthropic.com/v1
Model: claude-3-5-haiku-latest
Note: Low cost, great at coding tasks
System: You are a helpful search assistant.
Max Context Tokens: 128,000
Temperature: 0.8

Preset 4

URL: https://api.anthropic.com/v1
Model: claude-3-5-sonnet-latest
Note: Agentic model, high reasoning, great at analytics and coding
System: You are a helpful search assistant.
Max Context Tokens: 128,000
Temperature: 0.8

Preset 5 (On device Inference with ollama)

URL: http://localhost:11434/v1
Model: llama3.2:latest
Note: Great for general search queries
System: You are a helpful search assistant.
Max Context Tokens: 8192
Temperature: 0.8

Preset 6

URL: https://api.x.ai/v1
Model: grok-2-vision-1212
Note: High reasoning model with vision capability
System: You are a helpful search assistant.
Max Context Tokens: 8192
Temperature: 0.8

Preset 7

URL: https://api.x.ai/v1
Model: grok-2-latest
Note: High reasoning model, great at math
System: You are a helpful search assistant.
Max Context Tokens: 131072
Temperature: 0.8

Preset 8 (Online)

URL: https://api.perplexity.ai
Model: sonar-pro
Note: web search, ex: events, weather, stock prices
System: You are a helpful search assistant.
Max Context Tokens: 128,000
Temperature: 0.8

Preset 9 (Free Default)

URL: https://api.cerebras.ai/v1
Model: llama3.3-70b
Note: Fast responses, best for general search queries at a cost of zero
System: You are a helpful search assistant.
Max Context Tokens: 8,192
Temperature: 0.8

Preset 10

URL: https://openrouter.ai/api/v1
Model: deepseek/deepseek-r1
Note: High reasoning model with vision capability at a low cost, append :online to model name for web search results
System: You are a helpful search assistant.
Max Context Tokens: 128,000
Temperature: 0.8

Preset 11

URL: https://openrouter.ai/api/v1
Model: o1
Note: Reasoning model designed to solve hard problems across domains, append :online to model name for web search results
System: You are a helpful search assistant.
Max Context Tokens: 128,000
Temperature: 0.8

Preset 12

URL: https://openrouter.ai/api/v1
Model: o3-mini
Note: Fast and affordable reasoning model for specialized tasks, append :online to model name for web search results
System: You are a helpful search assistant.
Max Context Tokens: 128,000
Temperature: 0.8