Sunday, December 29, 2019

YOLO - Computer Vision

I recently stumbled upon the You Only Look Once (YOLO) computer vision algorithm, which shows some remarkable results.  This post will focus on a brief introduction to this system and some examples of use from the limited time I've spent with it recently.

YOLO takes the stance of using a classifier as a detector.  In short, the algorithm splits a frame into an SxS grid and processes each grid cell under the premise that it may contain an object centered within it.  It then performs some image processing to propose a series of bounding boxes of interest and runs classifiers on each bounding box.  The classifier returns a confidence metric for each class, say 0-100.  So, suppose a bounding box contains a dog: the 'cat' classifier returns a low confidence score, the 'bowling ball' classifier also scores low, ..., and the 'dog' classifier scores high.  The box would then be tagged as containing a dog.  The algorithm assumes each grid cell is responsible for no more than one object; the highest confidence metric wins.

The rest of this post will focus on quickly setting up YOLO and running it on a series of test images.  Essentially, 3 steps: 1) download and install darknet (an open-source neural network framework), 2) download the pre-trained YOLO weights, 3) run YOLO on a series of images.  Let's get started.

Install Darknet

$ git clone https://github.com/pjreddie/darknet
Cloning into 'darknet'...
remote: Enumerating objects: 5901, done.
remote: Total 5901 (delta 0), reused 0 (delta 0), pack-reused 5901
Receiving objects: 100% (5901/5901), 6.16 MiB | 4.44 MiB/s, done.
Resolving deltas: 100% (3915/3915), done.
Checking connectivity... done.
$ cd darknet; make
...
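
Note the default build is CPU-only, which is slow for detection (over a minute per image in my later runs).  If you have a CUDA-capable GPU you can flip the flags at the top of darknet's Makefile before building; a sketch, assuming the upstream Makefile still exposes the GPU/CUDNN flags and you have the CUDA toolkit installed:

$ sed -i 's/^GPU=0/GPU=1/' Makefile
$ sed -i 's/^CUDNN=0/CUDNN=1/' Makefile
$ make clean && make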

Download YOLO Weights

$ wget https://pjreddie.com/media/files/yolov3.weights -O darknet/yolov3.weights

Run on Images

$ cd darknet
$ ./darknet detect cfg/yolov3.cfg yolov3.weights ~/Photos/image01.jpg
$ display predictions.jpg

The predictions image will surround detected objects with bounding boxes and labels, like this:


Running YOLO on the above photo results in the following output and predictions image;
/home/lipeltgm/Downloads/nature-cats-dogs_t800.jpg: Predicted in 76.252667 seconds.
dog: 95%
cat: 94%
person: 99%
person: 99%

YOLO found 4 objects, each with high confidence: one cat, one dog and two people;

Running it against my existing personal photos (~6400 images) and ad hoc reviewing the results looks extremely promising; results follow.

Without any pre-processing or prep, I pointed the YOLO detector at my personal archive of photos, some 6400 images of vacations, camping trips, weddings,....  This process took a couple of days, launching the darknet detect process individually for each photo; as a result the weights were reloaded for every photo, which significantly slowed things down, but I wasn't really interested in performance so much as the detections themselves.
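
The batch run itself can be scripted as a simple loop over the archive, capturing the console output into the log that's summarized below.  A rough sketch (the photo directory, the predictions/ folder, and the exact loop are my own assumptions; only bigrun.log matches the actual log parsed next):

$ mkdir -p predictions
$ for f in ~/Photos/*.jpg; do ./darknet detect cfg/yolov3.cfg yolov3.weights "$f" >> bigrun.log 2>&1; cp predictions.jpg "predictions/$(basename "$f")"; done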

Here are the types of objects found in my photos:
lipeltgm@kaylee:~$ grep "^.*:" ./blog/YOLO/darknet/bigrun.log | grep -v Predic | cut -f 1 -d ':' | sort | uniq -c | sort -n
      1 baseball glove
      1 broccoli
      1 hot dog
      1 kite
      1 scissors
      2 donut
      2 mouse
      2 parking meter
      3 apple
      3 banana
      3 orange
      3 pizza
      3 skateboard
      3 snowboard
      3 zebra
      4 sandwich
      4 toothbrush
      5 train
      6 bus
      6 skis
      6 stop sign
      7 baseball bat
      7 fork
      8 giraffe
      8 toilet
      9 cow
      9 knife
      9 microwave
      9 spoon
      9 surfboard
     10 frisbee
     10 remote
     12 tennis racket
     14 aeroplane
     15 elephant
     16 oven
     18 motorbike
     18 sink
     20 wine glass
     21 vase
     22 fire hydrant
     26 bicycle
     26 sheep
     29 cake
     31 refrigerator
     34 cat
     34 suitcase
     36 teddy bear
     38 sports ball
     40 horse
     43 laptop
     49 cell phone
     54 traffic light
     64 bear
     70 bowl
     77 bed
     77 clock
     88 pottedplant
    103 bird
    106 backpack
    112 handbag
    124 sofa
    124 tvmonitor
    129 umbrella
    156 dog
    170 bottle
    187 book
    194 diningtable
    204 tie
    237 bench
    275 truck
    304 cup
    355 boat
   1029 chair
   1333 car
  14683 person


Gotta say, pretty cool, and it located a number of random objects I didn't realize I had photos of.  Who knew I had a photo of zebras, but in fact I really do.  DisneyWorld is amazing:


Have fun with it!!


Sunday, December 22, 2019

My Journey with Computer Vision


The spring of 1997 was a particularly interesting semester in my academic career; I was immersed in two challenging yet complementary classes: Computer Graphics and Computer Vision.  This was my first introduction to the concept of computer vision and, while I'm far from an authority, I do have a recurring history of dabbling in it ever since.  This week I stumbled upon a new'ish object detection algorithm and once again the computer vision mistress has set its seductive grip on me.

This new'ish algorithm will be the focus of a future post; in the meantime I wanted to spend some time pondering the general topic of computer vision.  Consider it a retrospective of what I learned that semester, how the focus of the technology has changed since, and things I wish I had known in college on the subject.

In the 90's the subject of computer vision was heavily based on simple image processing.  'Simple' may be misleading and is in no way meant to be condescending or judgmental; rather, simple in terms of what algorithms were achievable given the constraints of the processing power of the era.

At the core of the course was this book;
http://www.cse.usf.edu/~r1k/MachineVisionBook/MachineVision.pdf

I include it as it sets the stage for the state of the discipline at the time.  In that era, computer vision was mostly image processing, with a concentration on finding object silhouettes and features followed by trying to match the silhouette to a known 'good'.  This two-phased approach (detecting features and comparing features) continues to be at the core of vision systems.  At the time, feature detection was in the forefront, with limited understanding of how to effectively compare the found features.  I'd argue that the era was primarily video/image processing rather than what we've grown to know as computer vision.  The discipline was in its primordial stage of evolution; feature detection needed to be solved before classification, and again the resources of the time were less bountiful than what we have by today's standards.

So, followers of computer vision concentrated on image/video processing fundamentals.  We searched for ways to process an image pixel and draw relationships between the connectivity of neighboring pixels.  We implemented various means of thresholding and a variety of filters with the objective of generating meaningful binary images or grayscale models.  Binary and/or grayscale models in hand, you were met with an unsatisfying cliff-hanger, much like the ending of The Sopranos, simply because the development of classification mechanisms was just beginning.

This brings us to the retrospective promised in the introduction: I wish I had understood the *true* reason there was such a focus on image processing, because that reason is what later revolutionized the course of the discipline.

Take this furry little buddy;
The course was primarily focused on generating something like this;

Something readily done today by ImageMagick;
$ convert ~/Downloads/download.jpg -canny 0x1+10%+30% /tmp/foo.jpg 

Take a minute and look at the binary image above and ask yourself....what is the purpose of that image?  Really....take a minute.....I'll wait.

If you said "to get a series of lines/features/silhouettes that represent a cat" then you'd be in lock-step with the discipline at the time.  You'd focus on generating a series of models representing a cat, take that series of pixels and find a way to calculate a confidence metric that it's truly a cat.

What if you took the same approach with this image;


A wee bit tougher now?  But that's where an alternative answer to 'why do we look for lines/features/silhouettes' propelled the course of computer vision.  The features could tell you where to look, and this revolutionized the study.  The traditional process was detection => classification, but what if you viewed classification as detection?  What if we could simplify the group of cats into a series of cropped images, each with one cat, and ran a classifier on each subimage?

Take another look at the first binary image of the cat.  Draw a bounding box around the lines and what you have is an area to concentrate your attention on.  Looking at the top right of the image will get you precisely squat; the bounding box tells you where you should concentrate your computer vision algorithms.  The same goes for the group of cats: with an intelligent means of grouping, you can distinguish the four regions each containing a cat.  Run your classifier on each region and you're far more likely to detect the presence of a cat.
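
As a rough illustration of the 'where to look' idea, ImageMagick can report the bounding box of the non-background region of the edge image generated above (a sketch; the fuzz tolerance is an arbitrary pick of mine):

$ convert /tmp/foo.jpg -fuzz 15% -format "%@" info:

This prints a geometry of the form WIDTHxHEIGHT+X+Y, which is precisely the sort of region you'd hand to a classifier.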

The computer vision algorithms evolved into a slightly different process: 1) define a series of bounding boxes, 2) run a classifier on each box.  

A future post will focus on the YOLO (You Only Look Once) algorithm that is based on this idea.  While the concept of a classifier based detection system pre-dates YOLO, the paper made it clear that the industry had changed and I had not been aware.

Cheers


Sunday, December 15, 2019

FFMpeg Transitions -- Part 4



The past few posts have been focusing on dynamic filter transition effects.  This post will bookend this series.  Let's finish up.

Circular Transitions

Let's dust off our high school geometry notes; we can calculate the distance of a pixel (X,Y) from the center of the image (W/2, H/2) using the following:

dist(X,Y) = sqrt( (X - W/2)^2 + (Y - H/2)^2 )

If we compare this distance against a linearly increasing, time-based range, we can apply an increasing/decreasing circular boundary.  In this case we utilize max(W,H) to ensure the circular geometry finishes at the full dimensions of the video (width and/or height);



$ ffmpeg -i image02.mp4 -i image01.mp4 -filter_complex "[0:v][1:v]blend=all_expr='if(gte(sqrt((X-W/2)*(X-W/2)+(H/2-Y)*(H/2-Y)),(T/1*max(W,H))),A,B)'" circleOpen.mp4


$ ffmpeg -i image01.mp4 -i image02.mp4 -filter_complex "[0:v][1:v]blend=all_expr='if(lte(sqrt((X-W/2)*(X-W/2)+(H/2-Y)*(H/2-Y)),(max(W,H)-(T/1*max(W,H)))),A,B)'" circleClose.mp4


Expanding Rectangular Window

Combining the horizontal and vertical opening effects we can get an expanding window effect;

$ ffmpeg -i image02.mp4 -i image01.mp4 -filter_complex "[0:v][1:v]blend=all_expr='if(between(X,(W/2-T/1*W/2),(W/2+T/1*W/2))*between(Y,(H/2-T/1*H/2),(H/2+T/1*H/2)),B,A)'" expandingWindow.mp4

While there are countless other possible effects, this is a decent crack at a representative sampling of such effects.

Whelp, that's it; the knowledge bank is empty.  Hope you found this series of posts of some value, perhaps just a momentary read from your palace of solitude.

Cheers.

Saturday, December 7, 2019

FFMpeg Transitions -- Part 3

Our third post concerning video transitions.  Be sure to read our previous posts; I suggest reading them once for knowledge, a second time purely for fun, and periodically thereafter for continued inspiration.

The previous posts focused primarily on the overlay filter.  This post will focus on the blend filter applied dynamically with respect to time or position.

Cross Fade Effect

The blend filter is most typically used to cross fade from one video to another.  As briefly discussed in the first post, the general idea is to provide the filter with two video frames and a fractional weight to apply to each.  For example, a 50/50 split gives equal weight to each video.  These weights can change dynamically with respect to time.  Cross-fading begins by applying a weight of 1.0 to the first video and 0.0 to the second, then linearly decreases the weight of the first video while simultaneously increasing the weight of the second, finally ending with a weight of 0.0 for the first video and 1.0 for the second.  Easy, Peasy.

Let's take a look at the full filter example;

$ ffmpeg -i image01.mp4 -i image02.mp4  -filter_complex "[0:v][1:v]blend=all_expr='A*(1-min(T/1,1))+B*(min(T/1,1))'" blend.mp4


Note: as in past posts, the denominator in (T/1) implies that the transition will take 1 second.  Playing with that value will speed up or slow down the morphing.
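
For example, stretching the fade to roughly 2 seconds is just a matter of changing that denominator (the output filename here is my own):

$ ffmpeg -i image01.mp4 -i image02.mp4 -filter_complex "[0:v][1:v]blend=all_expr='A*(1-min(T/2,1))+B*(min(T/2,1))'" blend2s.mp4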

Location-Based Blend

The previous blend filter is applied uniformly to the entire frame.  Using conditionals and (X,Y) locations we can base the blending factors on position.  Consider a simple blend condition where the left-most half of the frame uses the first video and the right-most half uses the second.  Conceptually, the blend expression would look something like this;
'if(lte(X,W/2),A,B)'

When X is left of the horizontal center of the frame, apply A, otherwise B.

This example isn't particularly useful on its own, but imagine applying a moving threshold rather than a fixed one at the middle of the frame.  As that threshold moves from bottom to top the effect emulates a rising curtain;


$ ffmpeg -i image02.mp4 -i image01.mp4 -filter_complex "[0:v][1:v]blend=all_expr='if(lte(Y,(H-T/1*H)),A,B)'" curtainUp.mp4

Notice that rather than a fixed boundary at the midpoint, the threshold starts at H and progresses to 0 linearly over the 1 second duration.

Similarly, you can perform a curtain down effect like this;
$ ffmpeg -i image01.mp4 -i image02.mp4 -filter_complex "[0:v][1:v]blend=all_expr='if(gte(Y,(T/1*H)),A,B)'" curtainDown.mp4



Using the center point as a start or end point and mirroring the effect on each half of the frame opens the door to other effects, like;
$ ffmpeg -i image02.mp4 -i image01.mp4 -filter_complex "[0:v][1:v]blend=all_expr='if(between(X,(W/2-T/1*W/2),(W/2+T/1*W/2)),B,A)'" verticalOpen.mp4


$ ffmpeg -i image01.mp4 -i image02.mp4 -filter_complex "[0:v][1:v]blend=all_expr='if(between(Y,(H/2-T/1*H/2),(H/2+T/1*H/2)),B,A)'" horizontalOpen.mp4


Sunday, December 1, 2019

FFMpeg Transitions -- Part 2


This post continues on from last week's post.  Be sure to read the past post to establish a foundation for the content here.

Our journey will continue to focus on creating scene transitions of the form;

In this post we will focus on wipe transitions: up, down, left, right and the diagonals.

In general, these wipe transitions are done by applying the overlay filter at a time-based position.  We will begin with left/right/up/down, which can then be combined to create the diagonals.

Wipe Right

$ ffmpeg -i image01.mp4 -i image02.mp4 -filter_complex "[0:v][1:v]overlay=x='min(0,-W+(t/1)*W)':y=0[out]" -map "[out]" -y wipeRight.mp4

Wipe Left

$ ffmpeg -i image01.mp4 -i image02.mp4 -filter_complex "[0:v][1:v]overlay=x='max(0,W-(t/1)*W)':y=0[out]" -map "[out]" -y wipeLeft.mp4

Wipe Down

$ ffmpeg -i image01.mp4 -i image02.mp4 -filter_complex "[0:v][1:v]overlay=x='0':y='min(0,-H+(t/1)*H)'[out]" -map "[out]" -y wipeDown.mp4

Wipe Up

$ ffmpeg -i image01.mp4 -i image02.mp4 -filter_complex "[0:v][1:v]overlay=x='0':y='max(0,H-(t/1)*H)'[out]" -map "[out]" -y wipeUp.mp4

Diagonals

Now that we have the equations to manipulate the X or Y locations, the diagonals are simply created by applying position changes to X and Y.

$ ffmpeg -i image02.mp4 -i image01.mp4 -filter_complex "[0:v][1:v]overlay=x='min(0,-W+(t/1)*W)':y='min(0,-H+(t/1)*H)'[out]" -map "[out]" -y wipeRightDown.mp4
$ ffmpeg -i image02.mp4 -i image01.mp4 -filter_complex "[0:v][1:v]overlay=x='max(0,W-(t/1)*W)':y='min(0,-H+(t/1)*H)'[out]" -map "[out]" -y wipeLeftDown.mp4
$ ffmpeg -i image02.mp4 -i image01.mp4 -filter_complex "[0:v][1:v]overlay=x='max(0,W-(t/1)*W)':y='max(0,H-(t/1)*H)'[out]" -map "[out]" -y wipeLeftUp.mp4
$ ffmpeg -i image02.mp4 -i image01.mp4 -filter_complex "[0:v][1:v]overlay=x='min(0,-W+(t/1)*W)':y='max(0,H-(t/1)*H)'[out]" -map "[out]" -y wipeRightUp.mp4

Next week we will spend some time on the blend filter.

Cheers

Sunday, November 24, 2019

FFMpeg Transitions -- Part 1


As I look back at the content of this blog, a good chunk of the posts revolve around FFMpeg, roughly 1/4 of them to-date.  I've been playing with FFMpeg for years, mostly out of curiosity as I lack the creativity or ambition to create any significant media content.  I think I'm drawn to this topic for a few reasons: 1) I needed it some years ago to transcode videos for image detection software, 2) it's extremely powerful and is a tribute to the sophisticated command-line utilities that Unix emphasizes, 3) despite its power its documentation is IMO lacking, so I feel obligated to a degree to document what I discover.

This form of topic fits well with my goal for this blog as well: short, consumable content that can be authored in spare hours of the evening.  I intend to depart from this format temporarily, however, and author a multi-part series on video transition effects to give the topic sufficient attention and hopefully shed some light on the underlying mechanics used to create these effects.

Let's skip to the end temporarily and look at the result we will be shooting for.  We will use two input videos of the legendary duo and perform transitions between the two using various effects.  Despite these videos being of a static image, the same effects can be used for dynamic videos.
So, that's our bogey; we will tackle each effect one-by-one in this and future posts.

Fundamentals

The effects all build on some FFMpeg and mathematical fundamentals; we will try to set the stage in this section.

Two key FFMpeg filters enable these transitions; overlay and blend.  

The overlay filter allows placing a video or image atop another and can look something like this;

We create this effect by instructing FFMpeg to first render the first image, then render the second image pinned at position (x,y).  Altering the start (x,y) position of the 2nd image can give the appearance of the image moving.

The blend filter allows blending two images or videos into an output video, giving the appearance of one video melting and morphing into the other; looking something like this;

This effect is created by instructing FFMpeg to render each pixel as a composite of both input videos.  For example, we can tell FFMpeg to apply 50% of the first video and 50% of the second video to get something that looks like the above.  Similar in nature to the effect you'd get if you printed the frames on transparencies (e.g. clear plastic), laid them atop one another and held them up to a strong light.

These two filters will be the basis of the future sections so we'll spend a bit more time with them here.

We need to start with two input videos; they need to be of the same dimensions and resolution, and it's helpful if they are of the same duration.
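
If you need test inputs, clips like these can be generated from a still image, and ffprobe can confirm both clips share dimensions.  A quick sketch (the jpg filename and the 5 second duration are my own assumptions):

$ ffmpeg -loop 1 -i image01.jpg -t 5 -r 25 -pix_fmt yuv420p -y image01.mp4
$ ffprobe -v error -select_streams v:0 -show_entries stream=width,height -of csv=p=0 image01.mp4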


image01.mp4



image02.mp4

An example usage of the overlay filter takes the form:
$ ffmpeg -i image01.mp4 -i image02.mp4 -filter_complex "[0:v][1:v]overlay=x='W/4':y=0[out]" -map "[out]" -y example01.mp4

In short, the above command says: render an image01.mp4 frame, then overlay the corresponding image02.mp4 frame at (x=width/4, y=0).



An example of the blend filter takes the form:
$ ffmpeg -i image01.mp4 -i image02.mp4 -filter_complex "[0:v][1:v]blend=all_expr='A*(0.5)+B*(0.5)'" example02.mp4

In short, the above command says to blend 50% of the 1st video frame with 50% of the 2nd video frame; looking like this:

Future video transitions will make heavy use of these filters, but instead of using static values they will/may be based on time and/or position making for more sophisticated effects.  The FFMpeg filters can make use of internal filter variables and functions.  Let's consider ones that we may make use of:

Variables

W -- width of video
H -- height of video
X -- pixel x location
Y -- pixel y location
t/T -- time
Note: variables are filter specific; for example, the blend filter exposes time as T while the overlay filter uses t

Functions

between(i,min,max) -- returns 1 if min <= i <= max
lte(i,j) -- less than or equal to
gte(i,j) -- greater than or equal to
sqrt(i) -- square root

Time-Based Expressions

Applying time-based expressions opens the door to a number of sophisticated effects.  Suppose we use the following expression, where T is an internally available time variable and T0 is the desired transition duration:

f = min(T/T0, 1)

The result is a fraction that ramps linearly from 0.0 to 1.0 over T0 seconds.  This concept can be used to apply dynamic filters.

For example, an x position can be fed to a filter using the following equation; x will begin at 0 and end at the width of the video, linearly traversing the screen in T0 seconds:

x = min(T/T0, 1) * W

Wipe Right Transition

$ ffmpeg -i image01.mp4 -i image02.mp4 -filter_complex "[0:v][1:v]overlay=x='min(0,-W+(t/1)*W)':y=0[out]" -map "[out]" -y wipeRight.mp4

In the above example we are making use of the previous time-based location equation with a minor tweak; specifically the use of the min() function.

We start the overlay far left, outside the rendering dimensions (specifically at -W), then slowly move the x start location rightward.  The min() function prevents the overlay's x position from exceeding 0, snapping the final location directly over top of the lower image.  Also of note, T0 in this case is defined as 1 second, which means the wipe effect completes within 1 second.  Tweaking that value will speed up or slow down the motion.
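
For example, stretching the wipe out to 3 seconds only requires changing that denominator (the output filename is my own):

$ ffmpeg -i image01.mp4 -i image02.mp4 -filter_complex "[0:v][1:v]overlay=x='min(0,-W+(t/3)*W)':y=0[out]" -map "[out]" -y wipeRight3s.mp4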

Wipe Left Transition

Let's reverse the effect, moving the overlay from right-to-left;
$ ffmpeg -i image01.mp4 -i image02.mp4 -filter_complex "[0:v][1:v]overlay=x='max(0,W-(t/1)*W)':y=0[out]" -map "[out]" -y wipeLeft.mp4

A similar equation to the one above, only starting at W and moving left.  The max() halts the position at its final location.

A reasonable stopping point, we will put a pin in it for now and follow on with additional effects in later posts.

Cheers.



Sunday, November 17, 2019

Making Use Of Idle Workstation


Speaking of lowered expectations; here's something you may really like.....or not.

Have you ever given thought to the amount of computational horsepower your machine packs and how much of it is wasted when not in use?  I thought it would be useful to write a script that could interrogate the current state of a Linux workstation to determine if it is available to run some background activities, similar to the SETI@home desktop behavior.

Determining whether the screensaver is active seems to be a pretty good indicator of whether a user is currently using the system.  Bundle that with an evaluation of the system load and you can get a pretty good indication of an idle system.

Following is a Tcl script that determines if the system screen saver is active.  One could imagine bundling this with suspending/resuming a process or virtual machine to make use of the idle time.

Let me know what you think and usages you may think of. 



#!/usr/bin/tclsh

# Query gnome-screensaver and return 1 if the screen saver is active,
# 0 otherwise.
proc isScreenSaverRunning { } {
  # default to 0 in case the command output matches neither pattern
  set running 0
  set response [exec gnome-screensaver-command -q]
  # puts $response
  foreach e [split $response \n] {
    switch -glob -- $e {
      "* inactive" {
        set running 0
      }
      "* active" {
        set running 1
      }
      default {
      }
    }
  }
  return $running
}

# Poll once a second; exit as soon as the screen saver becomes active.
proc check { } {
  set running [isScreenSaverRunning]
  if { $running } {
    puts "screen saver running"
    exit
  }
  after 1000 [list check]
}

#-----main------
after 1000 [list check]
vwait forever
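
The script above only looks at the screensaver; to fold in the system load check mentioned earlier, something as simple as the following shell one-liner could be exec'd from the Tcl script (a sketch; the 1.0 load threshold is an arbitrary choice of mine):

$ awk '{ exit ($1 < 1.0) ? 0 : 1 }' /proc/loadavg && echo "load is low enough for background work"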

Enjoy!

Monday, November 11, 2019

Non-Rectangular Video Cropping



I've come across a few posts asking how/if you can perform a circular crop on a video.  The focus of this post is to demonstrate that effect.  In all fairness, this method gives the effect of a crop by applying a transparent mask overlay image.

Let's fire through a circular crop example; a three-step process:
1) create a two-tone (black/white) mask image with the same dimensions as the video file
2) convert the two-tone image to a transparent image where the viewing area is transparent
3) apply the image overlay to the video

Create Two-Tone Image mask

$ convert -size 1280x720 xc:black -fill white -strokewidth 2 -draw 'circle 640,360 320,360' mask.png

The circle draw primitive takes a center point (640,360) and a point on the perimeter (320,360), yielding a circle of radius 320 centered in the 1280x720 frame.

Convert to Transparent View Image

$ convert mask.png -fuzz 10% -transparent white mask.png
Bear with me: the following image differs from the one above; the inner circle is transparent, but the blog background is white, giving the appearance of the same image as above.


Apply Mask Overlay To Video

$ ffmpeg -i input.mp4 -vf "movie=mask.png [watermark]; [in] [watermark] overlay=0:0 [out]" -strict -2 output.mp4

This final step gives the result;
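
The same recipe works for other shapes; only the mask from step 1 changes.  For example, a rounded-rectangle mask might look like this (a sketch; the coordinates are my own picks for a 1280x720 frame), with steps 2 and 3 unchanged:

$ convert -size 1280x720 xc:black -fill white -draw 'roundrectangle 160,90 1120,630 40,40' mask.png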

Sunday, November 3, 2019

What's Your Superpower?


While we may spend most of our day as mild mannered Shoeshine Boy, each and every one of us should be able to put on our cape and become Underdog when the need calls for it.  In time, experience begins carving out our very own toolkit, one that we have honed over the years and we alone can wield with mastery.  Our very own superpower.

On average, we are all average, but we each have our very own areas of expertise; things we perform more proficiently than most, perhaps more proficiently than anyone else.  Perhaps you pick up new technology faster than your peers, perhaps refactoring is your bag, or perhaps you wield Python like a katana, slicing and dicing data like Julia Child diced onions.

Sometimes these superpowers are developed as part of your job, and at the time you may very well use them with only average skill.  As time passes, your peers disperse and these tools become increasingly rare in your newly formed teams.  Once average, you become exceptional at this particular skill.

Lately I find I've dusted off a once-average GDB debugging skill, one that nearly every one of my peers was proficient in nearly 20 years ago, but which lately seems to be a hard-to-find skill in recent company.  Pairing GDB with Python, I've recently demonstrated proficiency in generating multi-process message-trace diagrams to gain significant understanding of how legacy systems behave.  To the astonishment of my boss, it appears that this is my superpower, given none of the remaining team has proficiency in such tooling.

So, what's your superpower?  Find it, hone it and wield it....it'll make you extraordinary.

Sunday, October 27, 2019

Making Slideshows w/FFMpeg

Browse Reddit or StackOverflow and you'll come across numerous posts asking how to generate slideshows with FFMpeg.  It seems a common interest and is incredibly easy with the right tools.  FFMpeg is one of said tools, the other....makefiles.

There are few utilities that are truly hated in this industry; makefiles seem to be one of them.  Despite its poor reputation, make is a perfect fit for many image and video processing pipelines.  I blame its bad reputation on the hacks that claim to maintain them.  Make's dependency engine is a perfect match for creating videos from input media.

This short little makefile will convert all the JPGs in the current directory into subvideos of 1 second duration, concat them together into a long video, and finally add audio.

$ cat Makefile 
SRCS=${wildcard *.jpg}
SIZES=$(subst .jpg,.jpg.size,${SRCS})
CLIPS=$(subst .jpg,.mp4,${SRCS})
SlideDuration=1
FPS=25

all: video.mp4

video.mp4: slideshow.mp4 audio.aac
	${SH} ffmpeg -i slideshow.mp4 -i audio.aac -shortest -strict -2 $@

audio.aac:
	${SH} youtube-dl -f best https://www.youtube.com/watch?v=IYbE2coMZPc
	${SH} ffmpeg -i Íslandsklukkur\ \(Instrumental\ Icelandic\ Folk\ Music\)-IYbE2coMZPc.mp4 -vn -codec copy audio.aac

slideshow.mp4: ${CLIPS}
	${SH} for f in `echo $^`; do echo "file '$$f'" >> filelist.txt; done;
	${SH} ffmpeg -y -f concat -i filelist.txt -codec copy $@
	${RM} filelist.txt

%.mp4: %.jpg video.size background.jpg
	${SH} ffmpeg -y -loop 1 -i background.jpg -i $< -filter_complex "overlay=(main_w-overlay_w)/2:(main_h-overlay_h)/2" -r $(FPS) -vframes $(shell echo $(FPS)*$(SlideDuration) | bc) -an $@

video.size: ${SIZES}
	${SH} echo $(shell cat *.size | cut -f 1 -d 'x' | sort -un | tail -1)x$(shell cat *.size | cut -f 2 -d 'x' | sort -un | tail -1) > $@

%.jpg.size: %.jpg
	${SH} mogrify -auto-orient $<
	${SH} identify $< | cut -f 3 -d ' ' > $@

background.jpg: video.size
	${SH} convert -size $(shell cat $<) xc:black $@

clean: 
	${RM} *.mp4 *.size background.jpg *.aac

This makefile works for an arbitrary list of image sizes.  It first auto-orients the images (portrait/landscape), as it's not uncommon for phone images to be mis-oriented.  It then generates a *.jpg.size file containing each image's dimensions.  These files are used by the video.size target, which computes the maximum image size; this allows the video to match the largest of the images.  Next, a background JPG of those dimensions is created.  Each image is then converted into a 1 second video, and the clips are concatenated into slideshow.mp4.  Finally, audio is extracted from a video downloaded from YouTube and applied to the slideshow to get the final video.
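
With the JPGs sitting alongside the Makefile, kicking off the whole pipeline (and cleaning up afterwards) is simply:

$ make
$ make clean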

I dumped our Iceland photo archive into the directory to test and it spit out the following video;


Sunday, October 20, 2019

Fantasy Engineering Team

I’ve been fortunate to have worked with some incredible engineers in my career.  Every so often my lovely wife will buy a PowerBall ticket that incites thoughts of ‘what would you do if you won’?  My answer has been consistent since college; I’d build something cool the way I felt it should be built.

The world is about compromise, often motivated by constraints.  Projects typically have limited time, limited budget and limited availability of team members.  Since college, I’ve always wanted to build something cool in the way I felt it should be done rather than how it has to be done due to pre-defined constraints.  So, it’s always fun to perform the mental exercise of “what if you could execute a software development project the way you think it should be done?”

In past years, I'd first focus on the what: what is the project?  Lately I tend not to think of that as the first step.  I am still a tech nerd and interesting projects are still a heavy influence, but I don't think about the what as much as I think about the who.  The reason: there are nearly infinite interesting projects and technologies; something interesting is always around the corner.  Instead, I focus on the people I'd want to do it with.

So, I figured I'd play out my top 6 draft picks; the software engineers I'd immediately call up if I ever had the ability to cherry-pick a software development team to work on some new sexy project.  The 'if I had a $100 million project budget' development team.  Unless we have run in the same circles, you're unlikely to know many of these folks.  They are primarily located in Minnesota and are folks I've had the pleasure to work with.  For those of you who don't recognize them, I'd suggest you focus on the whys rather than the whos; why would I pick this superstar over the dozens and dozens and dozens of other folks I've worked with.

One last thing; I've intentionally excluded my wife, who is an extraordinary software engineer, simply because of an oath.  I've made two oaths in my life: 1) the oath to her in marriage, 2) the oath to never work with her on the same project again.  We worked on the same team in the days when my hair was much fuller and less gray, and it was an incredible strain on our relationship.  The lack of separation between home life and work life is an ugly monster, and while I've matured since those days it's simply not worth the risk of a repeat.

My dream team list, in no particular order;

Marshall Meier

I’ve known Marshall for nearly my entire professional life, and most of my college days.  From ‘go’ it was clear that this man was destined for greatness, one of the most organized and even-tempered men you’ll ever meet.  His ambition and technical prowess had drawn him into the consulting community early where he fine-tuned his professional communication skills and honed his breadth of technical expertise.  His continued pursuit of knowledge has earned him multiple degrees, many of which were earned while working and balancing a full family life.  Any good team needs a calming voice and technical leadership during the monsoon of challenges every worthwhile project will encounter.  The kind of guy that can talk you off the edge when you’re prepared to abandon ship.  The type of guy that is well-versed in how to make a happy, productive, engaged team and how to maintain it.  The kind of guy that has the technical chops to bounce ideas off of as well as the kind of guy that can/will roll up his sleeves and put in the work.  Great engineering teams need strong technical expertise as well as people with proficiency in building effective working relations.

Dan Ruhland

Dan started his professional life as a controls engineer and was remarkable at it.  I first met Dan when he was responsible for writing the controls software for a 7-axis ammunition loading system.  He authored the control loop algorithms for each axis as well as the sequencing logic to transfer ammo to/from the breech while avoiding slamming one-of-a-kind equipment into one another; arguably the most sophisticated subsystem of the weapons system.  From our earliest encounters it was clear that Dan was an expert in his field: disciplined, precise and wicked smart.  Dan made the transition to software development in later years and brought with him a vast amount of technical knowledge of the system as well as all the professional characteristics that made him a great controls engineer.  While our profession is no stranger to team members from other engineering disciplines (e.g. EEs, physics,…), it's my opinion that many, while great at low-level software development, have difficulty being equally proficient with higher-level development activities such as abstraction, design patterns, and best practices.  Dan showed no such limitation.  I feel he's as proficient a high-level designer as he is at the hardware abstraction layer.  So, why specifically Dan?  Simply because I'm confident that regardless of the complexity of the problem to be solved he's got the intellectual horsepower and tenacity to devise a solution.  Dan would be my 'ace in the hole' for any math-intensive problems, regardless of complexity.


Matt Spencer

Smart, motivated and a quick learner; three of the key traits that come to mind when I think of Matt.  I've personally airdropped Matt into technical domains with no available support on more occasions than I care to admit.  Each and every time, Matt chisels his way through the jungle of tech and solves the problem; each and every time.  A jack of all trades as well as an expert in many, he's a legend in the community.  Engineers are often like engines; high-performance engines require high maintenance, and many high-performance engineers come with higher-maintenance traits (e.g. impatience, peculiarities, temperament,…).  Matt demonstrates none of these; all the performance with none of the maintenance.  Likely one of the quickest learners I've ever had the pleasure to work with; if it can be done, Matt can likely do it, and probably faster than you thought possible.  He truly is in a category of one.


Rob Rapheal

Some say that you get along with people who possess traits you either share or wish you had.  Rob has an overwhelming desire for new technologies, and it's apparent from how he spends his free time.  He's a tinkerer, continuously pursuing new technologies in his spare time hoping to one day apply them to a corporate project.  He's authored an Arduino-based temperature control system for his hobby kiln.......just for fun.  He once spent significant hours pursuing C#/.NET/WPF as emerging technologies while the corporation was focusing on MFC-based applications.  His gamble paid off when a greenfield development activity materialized and his knowledge of C#/WPF helped influence the product development.  In this profession you need folks that have an eye on the road as well as an eye on the horizon; Rob provides both.  His talents have been recognized by his peers, who in the past informally selected him as the lead designer/architect, a role that was later formalized.





Cliff Winkel

Cliff possesses an indescribable energy and enthusiasm.  Like a squirrel jacked up on Red Bull, this guy can pound out code at an extraordinary rate.  In this profession, we all 'get in the zone' on occasion, often through self-medicated caffeine ingestion.  Go head-to-head with this guy and you'll likely fall short.  Cliff worked on a multi-year green-field project with 2-week sprints and consistently completed his tasks regardless of what you threw at him.  Hitting targets is easy when you lay up, but this guy hungered for work, always chose challenging task loads, and never disappointed.  A wicked hard worker, enthusiastic and great to work with.



David Englund

You like to think that greatness in an individual can be recognized early.  David is not as long-in-the-tooth as many of the rest of my fantasy engineering team members.  Hungry, driven, a self-learner, with an eagerness to understand 'how it should be done'.  His ambition drew him into the consulting profession early, and when I was once asked by a contract house "He's pretty young to be asking for such a rate", I simply responded "His knowledge is beyond his years; just get him in an interview and that'll become evident".  David shares many of the same qualities as the rest of my fantasy team and is still in the beginning stage of his career.  I'm eager to see what he'll accomplish in the next decade.



So that's it, my fantasy engineering team; there are many talented alternates, but I felt it best to keep the list short'ish.  I'd stack this team against any other development squad in the world and stake a significant wager on their victory.  They all tend to have the types of qualities I'd want in a team: smart, driven, low-maintenance, self-learners, and high producers.  The kind of folks you'd equally want to hang out with as well as stake your project success on.

If PowerBall ever hits for me I’ll be calling in the draft.

Cheers.