Still dealing with a tail of bugs, some of which look like overzealous optimizations leading to loss of pointer capability (leading to a filc panic). But it works well enough that I can say "hi" on here.
The idea started as a pretty simple question: text chatbots are everywhere, but they rarely feel present. I wanted something closer to a call, where the character actually reacts in real time (voice, timing, expressions), not just “type, wait, reply”.
Beni is basically:
A Live2D avatar that animates during the call (expressions + motion driven by the conversation)
Real-time voice conversation (streaming response, not “wait 10 seconds then speak”)
Long-term memory so the character can keep context across sessions
The hardest part wasn’t generating text, it was making the whole loop feel synchronized: mic input, model response, TTS audio, and Live2D animation all need to line up or it feels broken immediately. I ended up spending more time on state management, latency and buffering than on prompts.
Some implementation details (happy to share more if anyone’s curious):
Browser-based real-time calling, with audio streaming and client-side playback control
Live2D rendering on the front end, with animation hooks tied to speech / state
A memory layer that stores lightweight user facts/preferences and conversation summaries to keep continuity
Current limitation: sign-in is required today (to persist memory and prevent abuse). I’m adding a guest mode soon for faster try-out and working on mobile view now.
What I’d love feedback on:
Does the “real-time call” loop feel responsive enough, or still too laggy?
Any ideas for better lip sync / expression timing on 2D/3D avatars in the browser?
Thanks, and I’ll be around in the comments.
He described how they measured atomic hand movements (reach, grasp, orient) in decimal seconds to balance the line. But he made a distinction that stuck with me:
Back then, the goal was Flow (smoothness), which inherently required some slack in the system. Today, he argued, the goal of modern management is Utilization (removing every micro-second of downtime).
His quote: "We deleted the 'waiting,' but we forgot that the waiting was the only time the human got to breathe."
I feel like I see this exact pattern in Software Engineering now. We treat Developer Idle Time as a defect to be eliminated by JIRA tickets, rather than the necessary slack required for thinking.
Ask HN: For those who have been in the industry for 20+ years, do you agree?
Suppose I get a preconfigured VPS with Claude Code, and ask it to make an android port of an app I have built, it will almost always automatically downloads the sdkmanager and accepts the license.
That is the flow that exists many times in its training data (which represents its own interesting wrinkle).
Regardless of what is in the license; I was a bit surprised to see it happen, and I'm sure I won't be the last nor will the android sdk license be the only one.
What is the legal status of an agreement accepted in such a manner - and perhaps more importantly - what ought to be the legal status considering that any position you take will be exploited by bad faith ~~actors~~ agents?
Examples
Simple function:
from polymcp.polymcp_toolkit import expose_tools_http
def add(a: int, b: int) -> int: """Add two numbers""" return a + b
app = expose_tools_http([add], title="Math Tools")
Run with:
uvicorn server_mcp:app --reload
Now add is exposed via MCP and can be called directly by AI agents.
API function:
import requests from polymcp.polymcp_toolkit import expose_tools_http
def get_weather(city: str): """Return current weather data for a city""" response = requests.get(f"https://api.weatherapi.com/v1/current.json?q={city}") return response.json()
app = expose_tools_http([get_weather], title="Weather Tools")
AI agents can call get_weather("London") to get real-time weather data instantly.
Business workflow function:
import pandas as pd from polymcp.polymcp_toolkit import expose_tools_http
def calculate_commissions(sales_data: list[dict]): """Calculate sales commissions from sales data""" df = pd.DataFrame(sales_data) df["commission"] = df["sales_amount"] * 0.05 return df.to_dict(orient="records")
app = expose_tools_http([calculate_commissions], title="Business Tools")
AI agents can now generate commission reports automatically.
Why it matters for companies • Reuse existing code immediately: legacy scripts, internal libraries, APIs. • Automate complex workflows: AI can orchestrate multiple tools reliably. • Plug-and-play: multiple Python functions exposed on the same MCP server. • Reduce development time: no custom wrappers or middleware needed. • Built-in reliability: input/output validation and error handling included.
Polymcp makes Python functions immediately usable by AI agents, standardizing integration across enterprise software.