I’ve been interested in Python for a while now. One thing I like is how easy it makes face recognition. I wrote a Python script based on a YouTube video from sentdex. The script worked fine, and after some tests I was confident that I could extend it to do what I wanted.

The main thing that annoyed me was that every time you run the script, it goes through the same intensive process: checking whether there is a face in each image and encoding that face as an array of 128 values. That 128-value array (the “encoded face”) doesn’t change between run 1, run 2 and run 100. Only a change to the image, or perhaps to the library (for better processing), would be a reason to process the image again.

Another thing that annoyed me was the size of the files. Once the files have been processed and you have the “encoded face”, there is no need to keep them, because the information we need (and now have) is much smaller than the image file.

A couple of hours later I still couldn’t find any reliable, usable information about how that data is represented. There is a lot of information about what the data means, but I (at least) couldn’t find the correct way to store it in a MariaDB database. So, back to the beginning: what is the data type that comes out of face_recognition.face_encodings?

Let’s find out what the type is:

encoding = face_recognition.face_encodings(image)[0]
print(type(encoding))

This results in the output:

<class 'numpy.ndarray'>

After a lot of searching (again), it turned out there was no easy solution:

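  • Python’s sqlite3 module can store arrays by registering adapter and converter functions for a custom type (SQLite itself has no native array type), but I found no equivalent array type for MariaDB.
  • You can use pickle to convert the data into something you can store in the database, but I just couldn’t get it working with a normal INSERT statement.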
  • Converting it to base64 gave me the same issue as pickle.
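
One possible cause of the pickle and base64 trouble, and this is a guess since the failing code isn’t shown, is pasting the value directly into the SQL text. A minimal sketch of the alternative: pass the bytes as a query parameter and store them in a BLOB column. The table and column names are made up, a list of 128 random floats stands in for encoding.tolist(), and the standard-library sqlite3 module stands in for MariaDB so the sketch runs without a server. A MariaDB connector accepts parameters in a similar way, though its placeholder syntax may differ.

```python
import pickle
import random
import sqlite3

# Stand-in for encoding.tolist(): 128 floats (assumption, not real face data).
fake_encoding = [random.uniform(-1.0, 1.0) for _ in range(128)]

# pickle produces raw bytes, which cannot safely be pasted into SQL text.
blob = pickle.dumps(fake_encoding)

# sqlite3 is used here only so the example runs without a MariaDB server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE faces_blob (name TEXT, encoding BLOB)")  # hypothetical schema

# The placeholders let the driver handle the bytes; no string formatting involved.
conn.execute("INSERT INTO faces_blob (name, encoding) VALUES (?, ?)", ("alice", blob))

# Read the bytes back and unpickle them (only unpickle data you stored yourself).
stored = conn.execute(
    "SELECT encoding FROM faces_blob WHERE name = ?", ("alice",)
).fetchone()[0]
restored = pickle.loads(stored)
conn.close()
```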

Eventually I found the solution to my misery. It was simple, but it took me a while to find: str(encoding.tolist())

encoding = face_recognition.face_encodings(image)[0]
encodinglist = str(encoding.tolist())
print(type(encodinglist))

This gave me the following output: <class 'str'>

Well, strings are something MariaDB can work with. I created a column as varchar(4096) and voilà, the data can be stored.
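
A minimal sketch of this storing step, under several assumptions: the table and column names are made up, a list of 128 random floats stands in for encoding.tolist(), and sqlite3 stands in for MariaDB so it runs without a server. It also checks that the resulting string fits in 4096 characters.

```python
import random
import sqlite3

# Stand-in for encoding.tolist() (assumption: 128 floats, not real face data).
fake_encoding = [random.uniform(-1.0, 1.0) for _ in range(128)]
encodinglist = str(fake_encoding)

# Each float prints as roughly 20 characters, so 128 of them stay well under 4096.
fits = len(encodinglist) <= 4096

# sqlite3 is only a stand-in; with MariaDB the column would be varchar(4096).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE faces (name VARCHAR(64), encoding VARCHAR(4096))")  # hypothetical schema
conn.execute("INSERT INTO faces (name, encoding) VALUES (?, ?)", ("alice", encodinglist))
row_count = conn.execute("SELECT COUNT(*) FROM faces").fetchone()[0]
conn.close()
```
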
Two thirds of the three-part process (getting the data and storing it) is now done; what remains is getting the data back. The retrieval is very straightforward: select the correct columns from the table and store the values in an array again.

Because the data is now stored as a string and not an array, face_recognition.compare_faces can’t do anything with it. Converting the string back to an array is very easy:

KNOWN_FACES.append(eval(row[1]))

The eval call converts the string back into a list of 128 floats and appends that list to the KNOWN_FACES list.
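
Because eval runs whatever Python code the string contains, a more cautious variant may be worth considering. The sketch below uses ast.literal_eval from the standard library, which only accepts literals such as lists and numbers, and that is all the stored string holds. This is an alternative to the line above, not what the original script does. The table, the column names, the KNOWN_NAMES list and the sample values are made up, and sqlite3 stands in for MariaDB so the sketch runs without a server.

```python
import ast
import sqlite3

# Hypothetical table with one stored encoding (shortened to 3 values for readability).
conn = sqlite3.connect(":memory:")  # sqlite3 stands in for MariaDB here
conn.execute("CREATE TABLE faces (name VARCHAR(64), encoding VARCHAR(4096))")
original = [-0.125, 0.5, 0.03125]
conn.execute("INSERT INTO faces (name, encoding) VALUES (?, ?)", ("alice", str(original)))

KNOWN_NAMES = []
KNOWN_FACES = []
for row in conn.execute("SELECT name, encoding FROM faces"):
    KNOWN_NAMES.append(row[0])
    # literal_eval parses the list literal but refuses function calls and other code.
    KNOWN_FACES.append(ast.literal_eval(row[1]))
conn.close()
```

If you need numpy arrays rather than plain lists, numpy.array(KNOWN_FACES) converts the whole list in one go.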

So now, after processing the images, the main information is stored in the database and you can throw away the images if you want. Or not; do what you like.