I did some snooping of the HTTP traffic from the client to the camera, and I now understand why manual focus is nearly unusable. Here's what the web client sends when you depress Focus In, i.e. on a MouseDown() event:
[{"cmd":"PtzCtrl","action":0,"param":{"channel":0,"op":"FocusInc","speed":32}}]
and when you release the mouse button, i.e. on a MouseUp() event:
[{"cmd":"PtzCtrl","action":0,"param":{"channel":0,"op":"Stop"}}]
The focus motor doesn't increment to a numerical position or by a numerical amount; it goes forward (or backward) until it's explicitly told to Stop, and timing of the Stop is based on a MouseUp() event in the browser. With the inherent latency in the video preview and the high latency of controlling this over the internet, accurate focusing based on visual feedback is virtually impossible.
I've also observed that changing the Speed value in the slider has no apparent effect on Focus and Zoom. The command appears to change, but the Focus and Zoom motors appear to move just as fast at "speed":1 as they do at "speed":32.
So here's my proposed solution, which should be doable in software:
- Interpret the Speed slider as a duration instead of as a speed indicator
- Accept mouse clicks as discrete events instead of using MouseDown() and MouseUp()
- Set a timer of X milliseconds, based on a factor of the Speed slider value
- Run the selected motor for X milliseconds and then Stop, on each click of the mouse
It wouldn't be as good as a true numerically encoded position, but it would be far, far better than what we have now. We'd be able to slowly increment the zoom or focus in or out, until we can visually verify the results we want, and it wouldn't be subject to the latency of the viewer or the connection.
In fact I might even try this from a .js application and see how well it works, since it can be done purely on the client side.
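Here's a minimal sketch of what that client-side approach might look like. It's untested against a real camera. The payload shapes are copied from the traffic above, but everything else is an assumption: the `MS_PER_STEP` factor, the 1 to 32 slider range, and the `send` function (whatever ends up posting the JSON to the camera) are placeholders I made up for illustration.

```javascript
// Hypothetical scale factor: milliseconds of motor run time per Speed slider step.
const MS_PER_STEP = 10;

// Build the start payload in the same shape the web client was observed sending.
function startCommand(op, speed) {
  return [{ cmd: "PtzCtrl", action: 0, param: { channel: 0, op, speed } }];
}

// Build the Stop payload, again matching the observed traffic.
function stopCommand() {
  return [{ cmd: "PtzCtrl", action: 0, param: { channel: 0, op: "Stop" } }];
}

// Treat the slider value as a duration: clamp to an assumed 1..32 range,
// then scale to milliseconds.
function pulseDuration(sliderValue) {
  const clamped = Math.min(32, Math.max(1, Math.round(sliderValue)));
  return clamped * MS_PER_STEP;
}

// Promise-based timer built on setTimeout.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// One click = one pulse: start the motor, wait, then always send Stop.
// `send` is an assumed async function that delivers a payload to the camera.
// `wait` is injectable so the timing can be faked in tests.
async function pulse(send, op, sliderValue, wait = sleep) {
  // The speed field showed no visible effect, so a fixed value is sent here.
  await send(startCommand(op, 32));
  try {
    await wait(pulseDuration(sliderValue));
  } finally {
    // Stop is sent even if the wait throws, so the motor is not left running.
    await send(stopCommand());
  }
}
```

The idea would be to call `pulse(send, "FocusInc", sliderValue)` from a plain click handler instead of the MouseDown()/MouseUp() pair. The timer still runs in the browser, so the actual run time is only as consistent as the gap between the two requests reaching the camera, but it no longer depends on reacting to a delayed video preview.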