How to Send Web Pages to LM Studio with Share2Agent
Process web pages with local LLM models running in LM Studio. Share2Agent extracts page content and sends it to a webhook receiver that calls LM Studio's OpenAI-compatible API. Summarize, translate, or analyze any page without sending data to the cloud.
Prerequisites
- LM Studio installed with a model loaded (lmstudio.ai)
- Local server running in LM Studio (the "Developer" tab)
- Share2Agent Chrome extension installed
- Python 3.10+
Step 1: Start the LM Studio Server
- Open LM Studio and load a model (e.g., Llama 3.2, Mistral, Phi-3).
- Go to the Developer tab.
- Click Start Server. The server runs at http://localhost:1234 by default.
- Verify it works:

```shell
curl http://localhost:1234/v1/models
```

You should see a JSON response listing your loaded model.
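The models endpoint follows the OpenAI list format that LM Studio emulates, so the check can also be scripted. A minimal sketch (the response fields shown are based on that OpenAI-style shape; your model id will differ):

```python
import json
import urllib.request


def model_ids(models_response: dict) -> list[str]:
    """Extract model ids from an OpenAI-style /v1/models response."""
    return [m["id"] for m in models_response.get("data", [])]


def fetch_model_ids(base_url: str = "http://localhost:1234") -> list[str]:
    """Query the running LM Studio server for its loaded models."""
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=10) as resp:
        return model_ids(json.loads(resp.read()))
```

If `fetch_model_ids()` returns an empty list, no model is loaded yet in LM Studio's UI.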
Step 2: Create the Webhook Receiver
This script receives pages from Share2Agent and sends the content to LM Studio's OpenAI-compatible endpoint.
Save this as lmstudio_receiver.py:
```python
#!/usr/bin/env python3
"""Share2Agent → LM Studio webhook receiver."""
import json
import urllib.request
from datetime import datetime
from http.server import HTTPServer, BaseHTTPRequestHandler
from pathlib import Path

PORT = 9876
OUTPUT_DIR = Path.home() / "share2agent-lmstudio"
LM_STUDIO_URL = "http://localhost:1234/v1/chat/completions"
DEFAULT_PROMPT = "Summarize this article in 3-5 bullet points."

OUTPUT_DIR.mkdir(parents=True, exist_ok=True)


def call_lm_studio(prompt: str, content: str) -> str:
    """Send the page content to LM Studio's OpenAI-compatible endpoint."""
    payload = json.dumps({
        "messages": [
            {"role": "system", "content": prompt},
            {"role": "user", "content": content},
        ],
        "temperature": 0.3,
        "max_tokens": 2048,
        "stream": False,
    }).encode()
    req = urllib.request.Request(
        LM_STUDIO_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        data = json.loads(resp.read())
    return data["choices"][0]["message"]["content"]


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        data = json.loads(self.rfile.read(length))
        title = data.get("title", "untitled")
        content = data.get("content", "")
        comment = data.get("comment", "").strip()
        prompt = comment if comment else DEFAULT_PROMPT

        print(f"Processing: {title[:60]}...")
        result = call_lm_studio(prompt, content)

        # Save the result as a Markdown file
        ts = datetime.now().strftime("%Y-%m-%d-%H%M")
        slug = title[:40].lower().replace(" ", "-")
        out = OUTPUT_DIR / f"{ts}-{slug}.md"
        out.write_text(
            f"# {title}\n\n"
            f"**Prompt:** {prompt}\n"
            f"**Source:** {data.get('url', '')}\n\n"
            f"---\n\n{result}\n"
        )
        print(f"Saved: {out.name}")

        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Access-Control-Allow-Origin", "*")
        self.end_headers()
        self.wfile.write(json.dumps({"status": "ok"}).encode())

    def do_OPTIONS(self):
        # CORS preflight response so the Chrome extension can POST here
        self.send_response(204)
        self.send_header("Access-Control-Allow-Origin", "*")
        self.send_header("Access-Control-Allow-Methods", "POST, OPTIONS")
        self.send_header("Access-Control-Allow-Headers", "Content-Type")
        self.end_headers()


if __name__ == "__main__":
    print(f"LM Studio receiver on :{PORT}")
    HTTPServer(("0.0.0.0", PORT), Handler).serve_forever()
```

The key difference from Ollama: LM Studio uses the OpenAI chat completions format (/v1/chat/completions with a messages array) instead of Ollama's generate endpoint. The comment from Share2Agent becomes the system prompt, and the page content becomes the user message.
Step 3: Run the Receiver
```shell
python3 -u lmstudio_receiver.py
```

Step 4: Configure Share2Agent
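Before wiring up the extension, the receiver can be exercised by hand with a payload mimicking what Share2Agent sends (the field names below are the ones the receiver script reads; the real extension payload may carry additional fields):

```python
import json


def sample_share_payload() -> dict:
    """A minimal payload with the fields lmstudio_receiver.py reads."""
    return {
        "title": "Example Page",
        "url": "https://example.com/article",
        "content": "Body text extracted from the page...",
        "comment": "Summarize the key arguments",
    }


if __name__ == "__main__":
    # Write it out, then POST it to the running receiver, e.g.:
    #   curl -X POST http://localhost:9876 \
    #        -H 'Content-Type: application/json' -d @payload.json
    print(json.dumps(sample_share_payload(), indent=2))
```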
- Click the Share2Agent extension icon in Chrome.
- Open Settings.
- Set the Webhook URL to http://localhost:9876.
- Save.
Step 5: Process a Page
- Navigate to any web page.
- Click the Share2Agent icon.
- Type an instruction in the comment field:
  - Summarize the key arguments
  - Extract all API endpoints mentioned
  - Translate to German
  - List action items from this meeting notes page
- Click Share.
Leave the comment empty to use the default summarization prompt.
Results are saved to ~/share2agent-lmstudio/:
~/share2agent-lmstudio/2026-03-28-1430-react-server-components.md
Customization
Model selection: LM Studio uses whichever model is currently loaded. Switch models in LM Studio's UI -- no code changes needed.
Temperature and tokens: Adjust temperature (0.0 = deterministic, 1.0 = creative) and max_tokens in the script to control output style and length.
Server port: If you changed LM Studio's server port, update the LM_STUDIO_URL variable.
What's Next?
- Add model routing -- parse the comment for keywords (e.g., "creative:" or "precise:") and adjust temperature dynamically.
- Build a reading list processor -- share multiple articles, then run a second script that reads all saved results and generates a combined brief.
- Connect to other OpenAI-compatible tools -- since LM Studio speaks the OpenAI API format, you can swap it for any compatible backend (vLLM, llama.cpp server, etc.) without changing the receiver code.
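The model-routing idea in the first bullet can be sketched as a small helper dropped into the receiver. The "creative:" and "precise:" prefixes are an invented convention for this sketch, not part of Share2Agent:

```python
def route_comment(comment: str, default_temp: float = 0.3) -> tuple[str, float]:
    """Strip a routing prefix from the comment and pick a temperature.

    "creative: rewrite this" -> ("rewrite this", 0.9)
    "precise: list the APIs" -> ("list the APIs", 0.0)
    anything else            -> (comment, default_temp)
    """
    routes = {"creative:": 0.9, "precise:": 0.0}
    lowered = comment.lower()
    for prefix, temp in routes.items():
        if lowered.startswith(prefix):
            return comment[len(prefix):].strip(), temp
    return comment, default_temp
```

In `do_POST`, the prompt and temperature would then come from `route_comment(comment)` instead of the fixed values.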