How to Send Web Pages to LM Studio with Share2Agent

Process web pages with local LLM models running in LM Studio. Share2Agent extracts page content and sends it to a webhook receiver that calls LM Studio's OpenAI-compatible API. Summarize, translate, or analyze any page without sending data to the cloud.


Prerequisites

  • LM Studio installed with a model loaded (lmstudio.ai)
  • Local server running in LM Studio (the "Developer" tab)
  • Share2Agent Chrome extension installed
  • Python 3.10+

Step 1: Start the LM Studio Server

  1. Open LM Studio and load a model (e.g., Llama 3.2, Mistral, Phi-3).
  2. Go to the Developer tab.
  3. Click Start Server. The server runs at http://localhost:1234 by default.
  4. Verify it works:
bash
curl http://localhost:1234/v1/models

You should see a JSON response listing your loaded model.
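The response follows the OpenAI models-list shape. A minimal sketch of pulling the model ids out of it (the sample body below is illustrative, not output from a real server):

```python
import json

# Illustrative response body, matching the OpenAI models-list shape
raw = '{"object": "list", "data": [{"id": "llama-3.2-3b-instruct", "object": "model"}]}'

models = json.loads(raw)
ids = [m["id"] for m in models["data"]]
print(ids)  # ['llama-3.2-3b-instruct']
```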


Step 2: Create the Webhook Receiver

This script receives pages from Share2Agent and sends the content to LM Studio's OpenAI-compatible endpoint.

Save this as lmstudio_receiver.py:

python
#!/usr/bin/env python3
"""Share2Agent → LM Studio webhook receiver."""
 
import json
import urllib.request
from datetime import datetime
from http.server import HTTPServer, BaseHTTPRequestHandler
from pathlib import Path
 
PORT = 9876
OUTPUT_DIR = Path.home() / "share2agent-lmstudio"
LM_STUDIO_URL = "http://localhost:1234/v1/chat/completions"
DEFAULT_PROMPT = "Summarize this article in 3-5 bullet points."
 
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
 
 
def call_lm_studio(prompt: str, content: str) -> str:
    payload = json.dumps({
        "messages": [
            {"role": "system", "content": prompt},
            {"role": "user", "content": content},
        ],
        "temperature": 0.3,
        "max_tokens": 2048,
        "stream": False,
    }).encode()
    req = urllib.request.Request(
        LM_STUDIO_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        data = json.loads(resp.read())
        return data["choices"][0]["message"]["content"]
 
 
class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        data = json.loads(self.rfile.read(length))
 
        title = data.get("title", "untitled")
        content = data.get("content", "")
        comment = data.get("comment", "").strip()
        prompt = comment if comment else DEFAULT_PROMPT
 
        print(f"Processing: {title[:60]}...")
        result = call_lm_studio(prompt, content)
 
        # Save result
        ts = datetime.now().strftime("%Y-%m-%d-%H%M")
        # Keep only filesystem-safe characters in the slug
        slug = "".join(c if c.isalnum() else "-" for c in title[:40].lower()).strip("-")
        out = OUTPUT_DIR / f"{ts}-{slug}.md"
        out.write_text(
            f"# {title}\n\n"
            f"**Prompt:** {prompt}\n"
            f"**Source:** {data.get('url', '')}\n\n"
            f"---\n\n{result}\n"
        )
        print(f"Saved: {out.name}")
 
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Access-Control-Allow-Origin", "*")
        self.end_headers()
        self.wfile.write(json.dumps({"status": "ok"}).encode())
 
    def do_OPTIONS(self):
        self.send_response(204)
        self.send_header("Access-Control-Allow-Origin", "*")
        self.send_header("Access-Control-Allow-Methods", "POST, OPTIONS")
        self.send_header("Access-Control-Allow-Headers", "Content-Type")
        self.end_headers()
 
 
if __name__ == "__main__":
    print(f"LM Studio receiver on :{PORT}")
    HTTPServer(("0.0.0.0", PORT), Handler).serve_forever()

The key difference from Ollama: LM Studio uses the OpenAI chat completions format (/v1/chat/completions with messages array) instead of Ollama's generate endpoint. The comment from Share2Agent becomes the system prompt, and the page content becomes the user message.
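Side by side, the two payload shapes look like this (a sketch; the Ollama model name is illustrative):

```python
import json

prompt = "Summarize this article in 3-5 bullet points."
content = "Article text goes here."

# OpenAI-style chat payload (what LM Studio's /v1/chat/completions expects):
# instruction and content are separate messages with roles.
lm_studio = {
    "messages": [
        {"role": "system", "content": prompt},
        {"role": "user", "content": content},
    ],
    "stream": False,
}

# Ollama's /api/generate payload folds both into a single prompt string.
ollama = {
    "model": "llama3.2",
    "prompt": f"{prompt}\n\n{content}",
    "stream": False,
}

print(json.dumps(lm_studio, indent=2))
```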


Step 3: Run the Receiver

bash
python3 -u lmstudio_receiver.py

Step 4: Configure Share2Agent

  1. Click the Share2Agent extension icon in Chrome.
  2. Open Settings.
  3. Set the Webhook URL to http://localhost:9876.
  4. Save.

Step 5: Process a Page

  1. Navigate to any web page.
  2. Click the Share2Agent icon.
  3. Type an instruction in the comment field:
    • Summarize the key arguments
    • Extract all API endpoints mentioned
    • Translate to German
    • List action items from these meeting notes
  4. Click Share.

Leave the comment empty to use the default summarization prompt.

Results are saved to ~/share2agent-lmstudio/:

~/share2agent-lmstudio/2026-03-28-1430-react-server-components.md

Customization

Model selection: LM Studio uses whichever model is currently loaded. Switch models in LM Studio's UI -- no code changes needed.

Temperature and tokens: Adjust temperature (0.0 = deterministic, 1.0 = creative) and max_tokens in the script to control output style and length.

Server port: If you changed LM Studio's server port, update the LM_STUDIO_URL variable.


What's Next?

  • Add model routing -- parse the comment for keywords (e.g., "creative:" or "precise:") and adjust temperature dynamically.
  • Build a reading list processor -- share multiple articles, then run a second script that reads all saved results and generates a combined brief.
  • Connect to other OpenAI-compatible tools -- since LM Studio speaks the OpenAI API format, you can swap it for any compatible backend (vLLM, llama.cpp server, etc.) without changing the receiver code.
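The model-routing idea can be sketched as a small helper that strips a keyword prefix from the comment and picks a temperature (the prefixes and values here are hypothetical, not part of Share2Agent):

```python
def route(comment: str, default_temp: float = 0.3) -> tuple[str, float]:
    """Strip a hypothetical routing prefix from the comment and pick a temperature."""
    for prefix, temp in {"creative:": 0.9, "precise:": 0.1}.items():
        if comment.lower().startswith(prefix):
            return comment[len(prefix):].strip(), temp
    return comment, default_temp

print(route("creative: rewrite this as a poem"))  # ('rewrite this as a poem', 0.9)
print(route("summarize the key points"))          # ('summarize the key points', 0.3)
```

In the receiver, you would call `route()` on the comment and pass the returned temperature into the payload instead of the hard-coded 0.3.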