Thursday, February 1, 2024

Yolo AutoCropping Presentation Videos


 

As camera resolutions continue to improve, capturing the full scene of a classroom, lecture, or presentation hall and autonomously focusing attention on the presenter becomes more practical.  Generally, a camera operator pans and zooms in on the presenter as they make their way around the stage to draw the audience's attention to the intended target.  Professionally filmed videos draw the audience's attention to the speaker, and their production quality contributes to a more informative presentation.

Wide-angle, static camera positions are an alternative for capturing presentations but generally fail to draw the audience's attention to the speaker.  With robust object detection, tracking the presenter's position can be automated, and by auto-cropping around the presenter a static camera can offer a budget-friendly alternative to more professional video production facilities.


YOLO (You Only Look Once) takes a different approach from earlier detectors, which repurposed a classifier and ran it over many candidate regions.  Authored by Joseph Redmon at the University of Washington, YOLO divides an image into a grid of regions, predicts bounding boxes and class probabilities for every region in a single pass of one network, then merges the overlapping predictions into a list of final objects.

 

Below is a proof-of-concept utilizing YOLO in an auto-cropping manner.  The wide-angle source video is used as input, object detection is focused on the front of the room to detect the presenter, and the frame is auto-cropped around the presenter.  Once the presenter location is available, we use a variety of means to 'pan the camera': the first snaps to the presenter location, the second smooths the camera motion by incorporating a 2-dimensional shaper, and the third uses the shaper but only moves the camera when the presenter nears the edges of the current crop window.

Each mechanism is a rough implementation, focused on rapid proof-of-concept rather than optimal results, but you get the idea.  
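
The three panning strategies can be sketched in a few lines of Python.  This is a rough illustration rather than the code behind the proof-of-concept: it assumes the detector has already supplied the presenter's center point for each frame, an exponential moving average stands in for the 2-dimensional shaper, and every name here (CropWindow, snap, smooth, smooth_near_edge, alpha, margin) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CropWindow:
    """Crop rectangle tracked by its center point, in source-frame pixels."""
    cx: float
    cy: float
    width: int
    height: int

    def box(self, frame_w, frame_h):
        """Return (left, top, right, bottom), clamped to stay inside the frame."""
        left = int(min(max(self.cx - self.width / 2, 0), frame_w - self.width))
        top = int(min(max(self.cy - self.height / 2, 0), frame_h - self.height))
        return left, top, left + self.width, top + self.height


def snap(win, px, py):
    """Strategy 1: jump the crop center straight to the presenter."""
    win.cx, win.cy = px, py


def smooth(win, px, py, alpha=0.1):
    """Strategy 2: ease toward the presenter with an exponential moving
    average (an assumed stand-in for the 2-dimensional shaper)."""
    win.cx += alpha * (px - win.cx)
    win.cy += alpha * (py - win.cy)


def smooth_near_edge(win, px, py, alpha=0.1, margin=0.25):
    """Strategy 3: ease only when the presenter is within `margin`
    (a fraction of the crop size) of the crop window's edge."""
    dead_w = win.width * (0.5 - margin)
    dead_h = win.height * (0.5 - margin)
    if abs(px - win.cx) > dead_w or abs(py - win.cy) > dead_h:
        smooth(win, px, py, alpha)
```

Calling one of the three functions per frame and then slicing the frame with box() gives the 'camera' view.  A smaller alpha pans more slowly, and a smaller margin lets the presenter wander closer to the edge before the window moves.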


The source video was found here: Minnebar7

 

 



Tuesday, January 30, 2024

Published My First Python Package


 

 

I started dabbling with Python back in 2012-ish, using it pretty regularly over the years but generally keeping my projects close to home.  Recently, I dipped my toe into publishing a Python package out to the known universe.

Back in the late 90's, the Precambrian Digital Age, I took a couple of courses that continue to pique my interest time and time again.  The first covered parallel processing, which was primarily constrained to supercomputers like the Cray-1 housed in a nearby lab on campus, on full display behind a full-glass wall.  It was a workhorse that eagerly awaited computationally intensive parallelized programs.

Its vocabulary was a customized version of Fortran.  The Computer Science department rarely used it; aerospace and atmospheric sciences used the system most heavily.

The second course, Distributed Operating Systems, taken a bit later, seemed to pair well with this budding interest in high-performance computing.  Beowulf clusters, commodity-grade networked computers running Linux, could be created from RadioShack-provided equipment fueled by inspiration.  Cloud computing, virtual machines, and even network-intensive applications hadn't breached the digital horizon, but small-cluster networked labs provided inspiration that multitudes of computing assets would one day join hands in forming the highly networked, parallel, distributed systems that are considered common today.

While robust and reliable distributed systems are highly sought after, engineering them is plagued with challenges.  Failed requests could be due to loss of the sent message, the loss of the response, the destination service abruptly terminating, relocation of the service, an over-tasked memory/CPU that slows the response, or any number of other factors.  Python and ZeroMQ pair well to allow the creation of a distributed system framework, which inspired my budding project.
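
From the client's side, most of those failure modes look the same: no reply arrives in time.  A common response is to bound the wait and retry.  Below is a minimal sketch of that idea using only the standard library; it is not part of dividere or ZeroMQ, and request_with_retry and the send callable are hypothetical names, with the transport abstracted as a function that raises TimeoutError when no reply shows up.

```python
import time


def request_with_retry(send, message, retries=3, timeout=1.0, backoff=0.0):
    """Call send(message, timeout) until it returns a reply or retries run out.

    `send` is a hypothetical transport callable: it returns the reply, or
    raises TimeoutError when no reply arrives within `timeout` seconds.
    """
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return send(message, timeout)
        except TimeoutError as err:
            # A lost request, a lost reply, a dead or relocated service, and
            # a slow host are indistinguishable here: all surface as a timeout.
            last_error = err
            if attempt < retries:
                time.sleep(backoff * attempt)  # linear backoff between attempts
    raise ConnectionError(f"no reply after {retries} attempts") from last_error
```

Retrying like this is only safe when the request is idempotent or the service de-duplicates requests, since a lost reply means the work may already have been done once.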

dividere UG

The public project repository is located at:

https://github.com/lipeltgm/dividere


This is my first cut at publishing a Python package; I tried to apply good design, test, and documentation principles along the way.  One particular challenge I encountered is that the package dependencies require a version of Protobuf that isn't currently available via 'normal channels'.  I'm hoping in time that complication will self-correct when compliant versions become the default.

Until then, it will likely require manual installation of Protobuf v3.19 (or later) before installing via pip3 from PyPI:

https://pypi.org/project/dividere/ 

$ pip3 install dividere


With the foundation in place, I intend to extend the framework to support more reliable messaging, database components, and robust failover detection and recovery.

More to come in the future, fingers-crossed.


Monday, January 29, 2024

Embarking on Authoring a Computer Science Book

 


Like many folks, I've occasionally been drawn to 'write a book', despite no real need for it.  In mid-2020, a couple of aspiring computer science majors contacted me via Reddit for advice, mostly involving 'what is CS' and 'how do I get into CS', but one exchange left me puzzled.  A young man from Ireland, just leaving high school, was accepted into university but later rejected as a result of Covid-related reductions in campus activity.  He asked what he could do to get a head start on self-study or prepare for the industry in the event he never got in.  My suggestion was simply: "go to the university bookstore, find the CS textbooks, buy them and begin self-study".  I heavily encouraged going to university, and stressed that I doubt I'd be a professional in the industry had I not done so, but as a plan B, aligning your self-study with the university curriculum would be better than ad hoc YouTube, influencer offerings, or code camps.

To my surprise, I quickly found I'm out of touch with how universities teach as of late.  Many no longer have physical textbooks, or virtual ones for that matter; instead they teach via interactive websites with automated grading.  These sites are restricted to university students, which closed the avenue to my suggestion.  So, over a few days I got the bug to write a book, a collection of my university CS teachings presented in the manner I wish they had been presented to me.  Worst case, after spending some time on it, I'd know how lofty an effort it is.

On and off, I return to this passion project, unsure of its practicality, but it helps rekindle my love for this career.

 

Attached is a snippet of my work in progress, not even titled yet:

WIP


I'd welcome any feedback on the chapter as well as experiences from anyone who has authored and published a book.

Friday, January 26, 2024

C++ Database ORM Project

 


 

Systems that utilize a database benefit from an automated means of translating database CRUD operations to application language.  

Products like 'ObjectStore' aim to provide mechanisms to exchange data between applications and the database in a seamless fashion.

This can be done by providing the means of converting C++ objects into SQL query/insert/update commands and converting query responses back into C++ objects.  

My SQL-fu being significantly rusty, I spent a bit of time attempting to create an ORM/MySQL project to get a better understanding of how such a product could be created.

If we take the tack of most 'language-independent' products, we can start with a language-independent intermediary language, one that allows us to define the type of objects we wish to store/retrieve from the database.

A db-object file (e.g. MyDb.odb) can specify a database object as follows:

dbclass MyRecord002
  float val01 as key;
end;

This odb file can then be pre-processed, creating language-dependent library components (MyDb.h, MyDb.cpp) which can then be used by applications directly.  Updates to the library component can automatically be applied in the database.  The linked association between the C++ object and the database is enforced by constructors and access methods.
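
To illustrate the pre-processing step, the sketch below parses the dbclass example above and emits a MySQL-style CREATE TABLE statement.  It is written in Python for brevity and is not the DbObjOverlay implementation: the regular expressions, the type mapping, and the function names are all assumptions, and the real tool generates C++ sources rather than a single SQL string.

```python
import re

# Assumed mapping from odb field types to MySQL column types.
SQL_TYPES = {"int": "INT", "float": "FLOAT", "long": "BIGINT", "text": "TEXT"}

# dbclass <Name> ... end;
CLASS_RE = re.compile(r"dbclass\s+(\w+)(.*?)end;", re.DOTALL)
# <type> <name> [as key];   where <type> may be char(N)
FIELD_RE = re.compile(r"(\w+(?:\(\d+\))?)\s+(\w+)(\s+as\s+key)?\s*;")


def parse_odb(text):
    """Return [(class_name, [(type, field_name, is_key), ...]), ...]."""
    classes = []
    for name, body in CLASS_RE.findall(text):
        fields = [(ftype, fname, bool(key))
                  for ftype, fname, key in FIELD_RE.findall(body)]
        classes.append((name, fields))
    return classes


def create_table_sql(name, fields):
    """Build a CREATE TABLE statement for one parsed dbclass."""
    columns = []
    for ftype, fname, _ in fields:
        # char(N) passes through upper-cased; other types use the mapping.
        sql_type = ftype.upper() if ftype.startswith("char(") else SQL_TYPES[ftype]
        columns.append(f"{fname} {sql_type}")
    keys = [fname for _, fname, is_key in fields if is_key]
    if keys:
        columns.append(f"PRIMARY KEY ({', '.join(keys)})")
    return f"CREATE TABLE {name} ({', '.join(columns)});"
```

For the MyRecord002 definition, parse_odb followed by create_table_sql yields `CREATE TABLE MyRecord002 (val01 FLOAT, PRIMARY KEY (val01));`.  The same parsed field list could drive generation of the C++ class members and accessors.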

A bit of proof-of-concept at this time, works for a handful of data types {int, float, long, text, char(X)}, soon to add date/time.  At this time, constrained to 'flat' datatypes, but *fingers-crossed* to work for nested datatypes in the future.

https://github.com/fsk-software/pub/tree/master/DbObjOverlay

Recently came across the Wt::Dbo reference as well, but haven't had a chance to take a look yet: https://www.webtoolkit.eu/wt/doc/tutorial/dbo.html