5 Tips for more effective logging

Logging is a critical part of every serious project. If logging is not important in your project – you’re probably doing logging wrong. Here are a few lessons I learned over the years running multiple projects.

1 – Reserve ERROR for errors: anything that is not a bug in your code shouldn’t be an ERROR

Every log line has a log level. The most important distinction in log levels is between ERROR and everything below ERROR. The following logic should guide you – an ERROR log line should indicate a bug in your code. If there’s an event that generates an ERROR log line which is not an indication of a bug in your code – this should not be logged as an error.

Furthermore, you should spend effort making sure that every recognizable error is logged as such. So, most handlers should be generically wrapped by an appropriate logger, and your 500 logger or equivalent should naturally emit ERROR logs.
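Here is a minimal sketch of what "generically wrapping" a handler could look like in Python. The decorator name `log_unhandled_errors` and the `divide` handler are made up for illustration, and a real web framework will usually have its own hook for this.

```python
import functools
import logging

logger = logging.getLogger("handlers")


def log_unhandled_errors(handler):
    """Wrap a handler so any uncaught exception is logged at ERROR level."""
    @functools.wraps(handler)
    def wrapper(*args, **kwargs):
        try:
            return handler(*args, **kwargs)
        except Exception:
            # logger.exception logs at ERROR and attaches the stack trace.
            logger.exception("Unhandled exception in %s", handler.__name__)
            raise
    return wrapper


@log_unhandled_errors
def divide(a, b):  # hypothetical handler, for illustration only
    return a / b
```

The exception is re-raised after logging, so whatever turns it into a 500 response still sees it.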

2 – User input validation failures should be warnings

As a natural result of our first tip, user input validation failures shouldn’t be logged as errors. They are your code doing what it should be doing. However, they still merit more than an INFO log line. So use WARNING here. Other events can also be logged as WARNINGs, such as resources running low, a fallback being used, etc.
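A small sketch of a validation failure logged as a WARNING. The `validate_age` function and its limits are hypothetical, and whether the raw value may be logged depends on your privacy constraints.

```python
import logging

logger = logging.getLogger("signup")


def validate_age(raw_value):
    """Return the age as an int, or None if the input is invalid."""
    try:
        age = int(raw_value)
    except (TypeError, ValueError):
        # The code is behaving correctly here, so this is not an ERROR.
        logger.warning("Age validation failed: not a number: %r", raw_value)
        return None
    if not 0 <= age <= 150:
        logger.warning("Age validation failed: out of range: %d", age)
        return None
    return age
```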

As a natural outcome of the first two tips, we come to tip no. 3:

3 – Alert on errors, or on multiple warnings

So, your log-levels are now correct. The next step is getting notified whenever an error happens – this is an indication you have a bug in your code. But you don’t want the same error happening a lot to flood your inbox (or whatever other reporting mechanism you use.)

You can de-dup your errors yourself, for example, by hashing the call stack. Alternatively, use a service such as sentry.io to do that for you. You can now send notifications such as E-Mails and text messages when new errors appear.
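As a rough sketch, de-duplicating by hashing the call stack could look like the following. The function names are made up, and the in-memory set stands in for whatever storage you would really use.

```python
import hashlib
import traceback


def error_fingerprint(exc):
    """Return a short hash identifying where an exception was raised."""
    frames = traceback.extract_tb(exc.__traceback__)
    # Use file, function and line of each frame, plus the exception type.
    # The message is left out on purpose: it often contains varying data.
    parts = [type(exc).__name__]
    parts += [f"{f.filename}:{f.name}:{f.lineno}" for f in frames]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()[:16]


seen_fingerprints = set()


def should_notify(exc):
    """True only the first time a given fingerprint is seen."""
    fingerprint = error_fingerprint(exc)
    if fingerprint in seen_fingerprints:
        return False
    seen_fingerprints.add(fingerprint)
    return True
```

Because line numbers are part of the fingerprint, the same bug may look "new" after a deploy that shifts the code. Dedicated services handle this more gracefully.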

Once you have that, you can also consider getting alerts for warnings that happen too many times. For example – if a particular user input validation fails often then perhaps your UX is broken. If a fallback happens too many times then perhaps your main flow is not robust enough.
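One possible way to do this with the standard `logging` module is a handler that counts repeated warnings. This is only a sketch: `RepeatedWarningHandler` and the `alert` callback are hypothetical, and a real setup would count within a time window rather than forever.

```python
import logging
from collections import Counter


class RepeatedWarningHandler(logging.Handler):
    """Call `alert` once when the same WARNING repeats `threshold` times."""

    def __init__(self, threshold, alert):
        super().__init__(level=logging.WARNING)
        self.threshold = threshold
        self.alert = alert
        self.counts = Counter()

    def emit(self, record):
        if record.levelno != logging.WARNING:
            return
        # record.msg is the unformatted template, so "bad email: %s" is
        # counted as one kind of warning regardless of the actual value.
        key = (record.name, record.msg)
        self.counts[key] += 1
        if self.counts[key] == self.threshold:
            self.alert(key, self.threshold)
```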

4 – Make your logs informative

Be liberal with adding info logs. At the least, all cross-service calls and requests to your API should be logged. Other major events/decisions should probably also be logged. Personally I’d probably prefer O(1) log lines per call to my API (i.e. don’t INFO log in a loop).

Independently of that, take care to include all the useful information you can in your logs. That includes file, line, perhaps all or part of your stack trace, and so on. The text logged should also be informative – if a particular value is incorrect, log both it and the desired value (be careful of privacy concerns though!)
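With Python's standard `logging` module, file and line can be added through the format string. The logger name `orders` and the `check_quantity` function below are made up for the example.

```python
import logging

formatter = logging.Formatter(
    "%(asctime)s %(levelname)s %(name)s %(filename)s:%(lineno)d %(message)s")
handler = logging.StreamHandler()
handler.setFormatter(formatter)

logger = logging.getLogger("orders")
logger.addHandler(handler)
logger.setLevel(logging.INFO)


def check_quantity(actual, expected):
    if actual != expected:
        # Log both the incorrect value and the desired one.
        logger.warning("Unexpected quantity: got %d, expected %d",
                       actual, expected)
        return False
    return True
```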

5 – Aggregate all logs into a single searchable database

Having a single, searchable log interface instead of several separate ones is critical. Being able to understand the complete flow of an issue is in many cases dependent on you seeing all the relevant information together. Having it searchable will greatly speed up your ability to find issues and fix them. Today at Flytrex we are using logz.io, but there are quite a few other effective solutions.

Bonus section

  • If your project involves two or more people, decide on a logging policy explicitly.
  • There’s a big difference between logging in libraries, tools that run once, or long-running programs. Each one has different needs.
  • For cases when your logs are not perfect (and they never are), a tool such as rookout is very useful. It allows you to set a “logging breakpoint” anywhere in your code – without redeploying it. This already saved me hours of debugging.

Photo credit: Wood photo created by onlyyouqj – www.freepik.com

Posted in Programming

Validating Flight Networks for Drones – part 2

In part-1 I described how we validate flight-networks at Flytrex to make sure that no two nodes of a flight network are too close. Now let’s turn our attention to validating that no two edges are too close.

First, how does one define “edges being too close”? What is the distance between edges?

Here’s a good definition – the distance between edges e1 and e2 is

D(e1, e2) = min D(p1, p2) for p1 ∈ e1 and p2 ∈ e2

That is, the minimum of all distances between pairs of points taken from e1 and e2 respectively.

Given that definition – if two edges cross, then of course the distance between them is zero. This is useful – two edges crossing is a more extreme case of two edges being too close to one another.
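For straight segments on a plane, this definition can be computed directly. The sketch below assumes 2D Euclidean coordinates given as `(x, y)` tuples, which is a simplification compared to distances on the earth's surface, and the function names are my own.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all 2D tuples)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: a single point
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def _orientation(a, b, c):
    """Sign of the cross product (b - a) x (c - a)."""
    value = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (value > 0) - (value < 0)


def _segments_cross(p1, p2, q1, q2):
    """True if the two segments properly cross each other."""
    return (_orientation(p1, p2, q1) * _orientation(p1, p2, q2) < 0 and
            _orientation(q1, q2, p1) * _orientation(q1, q2, p2) < 0)


def segment_distance(p1, p2, q1, q2):
    """D(e1, e2): minimum distance between any two points of the segments."""
    if _segments_cross(p1, p2, q1, q2):
        return 0.0
    # Otherwise the minimum is attained at an endpoint of one segment.
    return min(
        _point_segment_distance(p1, q1, q2),
        _point_segment_distance(p2, q1, q2),
        _point_segment_distance(q1, p1, p2),
        _point_segment_distance(q2, p1, p2),
    )
```

Segments that merely touch, or overlap along the same line, need no special case: one endpoint then lies on the other segment, so its point-to-segment distance is already zero.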

So how can we implement this? Unfortunately, our “closest pair” algorithm for vertices from the previous post is not good enough here. I was unsure what to do – so, like all lazy programmers, I took the easiest route and went to stackoverflow. Note, just looking for an existing question and answer pair was not good enough – I didn’t find a solution. As I wrote previously – it’s better to not be shy and actually ask for help.

What did I learn? I saw a suggestion to use Rtree. What is that?
Rtree is a wrapper around libspatialindex, which is a library that provides you with, well, a spatial index. A spatial index is a data structure that indexes objects according to their position in space. Read here for some theoretical details.

Visualization of an R*-tree for 3D points using ELKI (the cubes are directory pages).
Taken from the R-tree page on wikipedia

So our approach to solving our problem is relatively straightforward. First, we add each edge in our graph to our spatial index. Then, for each edge, we look up the closest edges to it, and make sure that the distance between our edge and each of them is still more than our MINIMAL_DISTANCE.

How many edges should we retrieve? Well, that depends. Obviously, if two edges share a vertex then the distance between them is zero, however, we do not want to alert on edges that share a vertex. So we need to get the N closest edges such that N > number of neighbour edges of our edge. For some simple graphs that is simple, but for the general case that might include all of the edges (e.g. a sun-shaped graph with a single vertex all edges share.)

Here is some sample code:

def _prepare(self):
    # Index every edge by its endpoints and add its bounding box to the tree.
    for idx, edge in enumerate(self.edges):
        self.node_to_edges[edge.from_vertex_id].add(idx)
        self.node_to_edges[edge.to_vertex_id].add(idx)
        self.tree.add(idx, self._get_bounds(idx))

def validate(self):
    for idx, edge in enumerate(self.edges):
        # Edges sharing a vertex with this edge (excluding the edge itself).
        neighbor_edges = (
            self.node_to_edges[edge.from_vertex_id] |
            self.node_to_edges[edge.to_vertex_id]) - {idx}
        # The +10 there is because we want the few edges that are not
        # our neighbors. (The distance to our neighbors will always
        # be 0 and we should ignore them)
        nearest = self.tree.nearest(
            self._get_bounds(idx), len(neighbor_edges) + 10)
        for nearest_idx in nearest:
            # Skip neighbors, the edge itself, and pairs already checked.
            if nearest_idx in neighbor_edges or nearest_idx <= idx:
                continue
            dist = self._get_distance(idx, nearest_idx)
            if dist < MIN_EDGE_DISTANCE_M:
                p1 = self.id_to_node[edge.from_vertex_id]
                p2 = self.id_to_node[edge.to_vertex_id]
                edge2 = self.edges[nearest_idx]
                q1 = self.id_to_node[edge2.from_vertex_id]
                q2 = self.id_to_node[edge2.to_vertex_id]
                # Note the trailing spaces: adjacent f-strings are
                # concatenated as-is.
                raise ValidationError(
                    f"The edges {p1.name} -> {p2.name} "
                    f"and {q1.name} -> {q2.name} "
                    f"are too close to one another. "
                    f"Distance is {dist:.1f}m but should be at least "
                    f"{MIN_EDGE_DISTANCE_M}m")

This code worked well. It manages to find issues in flight networks, and does so at a reasonable speed.

For myself, I’m happy to have added another tool to my arsenal – Rtree.

Posted in Algorithms

QA by Child

I recently published a home project I was working on, an app to teach children to read Hebrew. I wrote it originally to help my son learn to read Hebrew.

A screenshot of “Learn to Read Hebrew Easily”

In an early version my son was very excited to play it. He quickly understood the principle – see a word, then tap one of four emojis this word describes. Every time you tap a right answer, you get a few more points, which in that early version were displayed prominently at the top of the screen.

It took him less than five minutes to find a “cheat” – if you tap the right answer very quickly many times – you get points for every time you tap it, as long as the “correct answer” animation is running and the word is not changed.

It reminds me that a few years back I was working on Desti, and when I gave that same kid an early version of our iPad app – he broke it in less than 30 seconds just by moving his hand on the screen and touching everything at once.

Generally, if you have a GUI, one of the ways to find issues is to let a kid hack at it. One reason is that GUIs have the curious property that changes take non-zero time, and usually buttons are not disabled once they are initially tapped. As a result – you sometimes get the effect multiple times – which can result in extra score, multiple transitions, repeated actions on now incorrect state, and so on. In the extreme case this can lead to resource exhaustion very quickly and your app crashing. I’ve seen that happen to my app!
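The usual fix is to remember that an action was already taken and to ignore further taps until the state changes. Here is a toy sketch of that idea. The `Question` class is hypothetical and not taken from the actual app.

```python
class Question:
    """One round: a word plus the answer that matches it."""

    def __init__(self, correct_answer, points=10):
        self.correct_answer = correct_answer
        self.points = points
        self.answered = False  # set once the right answer was tapped

    def tap(self, answer):
        """Return the points earned by this tap."""
        if self.answered:
            # The "correct answer" animation is still running: ignore taps.
            return 0
        if answer != self.correct_answer:
            return 0
        self.answered = True
        return self.points
```

With this guard, tapping the right answer five times in a row during the animation scores once.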

What else do you get by giving your app to a child? You can see very quickly if your UI and UX are clear and easy to understand. If you need to explain what needs to be done – it’s not good enough. That’s true in general – and doubly so for a kid. If your kid gets it on their own – you did something right.

More deeply than UX – you can learn if your gamification works. Is your app/game engaging enough? Does it invite gameplay? Does the meta-game encourage repeated plays? It took me a lot of thought to get my reading app to work well – and it’s far from done.

Did you test your app with your kid? What was your experience with that?

Posted in Game Development

How to hire a freelancer – 25 useful tips

Over the years I’ve had many opportunities to work with freelancers. Recently a friend had a bad experience looking for freelancers, and I tried to help her find new ones. That prompted me to write a bit about my method. While not completely foolproof, it will increase your chances of finding better freelancers.

Defining the job offer

Before we start, there are a few questions you need to ask yourself.

1) Is this for a job I know how to do myself, but just need someone else to do it, or is it for a job I don’t know how to do myself?

Sometimes, I need someone to do some programming for me. Sometimes it’s some programming that I know how to do but just need manpower, and sometimes it’s in an area that I’m really unfamiliar with, say, machine-learning image analysis. Sometimes it’s for something completely different – e.g. narration, or graphic design – where I really can’t do it myself.

2) For a job that I don’t know how to do myself, do I know how to evaluate the quality of the work?

Let’s say I’m looking for a narrator – I can certainly judge the quality of the work. My judgement might not be the best – I will probably miss some finer points, but I will still be able to see if someone did a good job and it sounds good to me. On the other hand, let’s say I’m looking for someone to port my system to Azure for me, and at best I’m familiar with AWS. They might be making major mistakes in design and I wouldn’t know, because I’m not familiar with Azure.

3) Do I know how to evaluate the quality of the freelancer?

Prior to hiring the freelancer and looking at the result of their work, how well can I predict how good of a job they will do? Of course, looking at past work is the most obvious thing, but sometimes that’s also hard. Many people can boast of impressive demo projects but still will not be able to do what I want them to. Sometimes, their past work is protected by NDAs, e.g. security advisors. In these cases I will usually need to rely on recommendations.

4) Do I have fixed requirements, or is this a project with changing requirements?

Sometimes the requirements are very clear – I need some narration done, a background drawn, some specific functionality implemented. Sometimes, I want a large feature developed, with the details not yet specified, and I need the first mockup version implemented to be able to be specific. Sometimes, I need a result “make my DB go faster” or “make my website secure” – and the particular actions to take are as yet unknown.

5) Is this a long term or a short term, one time engagement?

Sometimes you just need some small task done. Sometimes you are looking for someone to work with for the long term. If I’m looking for a short term and fixed price project, then the requirements must be well known beforehand, and I must be able to judge the quality of the result.
Long term projects can be opportunities for freelancers, so they provide you a way to get a better price or more leeway in changes requested.

6) What is my budget for this project?

That’s one of the most critical questions. You might have a total project budget – or just a monthly budget for work done. Having a tight budget might force you to be very strict about your plans and requirements. Having some experimentation budget might allow you to hire multiple freelancers and pick the best one.

Specific techniques

Once you are clear with yourself about the answers to all of these questions, many decisions will become much easier to make. Here are some techniques to handle various situations.

My requirements are around results and I don’t know what steps need to be done. Examples: “Make my code faster”, “Make my website secure”, “Propose a design to make my website pretty”

  1. Make sure you know how to measure or evaluate the results. Specify clear criteria for success.
  2. Do you need just proposals for changes, or actual implementation? Actual implementation is better, unless you can clearly evaluate the proposed changes.
  3. When looking for freelancers, ask them what steps they will take to do their work, and what they expect the proposed changes are going to be. Then compare the answers of multiple experts. This will allow you to evaluate who makes sense and who does not.
  4. Building on the previous step – if one freelancer suggests doing X and another does not – ask the first “Why did you propose X?” and the second “Why didn’t you propose X?”. For example, when optimizing a database, one freelancer might suggest she will “Set up a read replica”. Ask the other one why he didn’t suggest it.

I have a lot of budget and I want to get the best freelancers

  1. It depends on what you mean by “a lot of budget” – but one easy way to get good freelancers is to hire multiple freelancers to do the job of just one, and pick the best result.
  2. A cheaper alternative is to hire multiple freelancers to do a test task, and keep only the best. You won’t even need a lot of budget for that, and if the project is critical for you, there’s a very good chance it’s worth it.
  3. If you’re working with developers, and you are hiring multiple freelancers, consider having them code-review each other. This will increase overall quality and give you another opportunity to evaluate their work.

I am looking for a freelancer for a long-term engagement

  1. The first task must be an evaluation task, and this should be communicated to the freelancer.
  2. Similar to the previous situation – you can use the evaluation task to pick the best freelancer out of a group. You can have all freelancers do the same task, or give each one a different task, as you still get to keep the results even if you don’t continue working with them.

General advice

  1. Always communicate clearly what your requirements are, and what evaluation criteria you will be using. This applies to all communications: to the first message, and to your reply when the job is done.
  2. Don’t be afraid to disagree with the freelancer. Don’t be afraid to say that you want changes. Just be clear and upfront about your expectations, and keep to the terms you agreed.
  3. If you’re not an expert in the area – get a friend to give you some advice, especially if a lot of money is at stake.
  4. Don’t be afraid to add requirements and questions when you are selecting a freelancer. Worst case – they will decide not to work with you.
  5. When choosing which freelancer to work with, don’t forget to evaluate the freelancer on their communication ability. If someone doesn’t answer your job interview questions, or doesn’t understand them – how will they complete your requirements? How will they understand the urgent bug you’re trying to explain?
  6. When hiring developers – you MUST either know how to manage a development project, or have a team manager, or at least have a friend advising you.
  7. If you’re hiring for a small non-programming job, consider using fiverr.com.
  8. If you’re hiring for a more complex project or engagement, consider using upwork.
  9. If you’re using upwork, my preference is to filter on >90% job success, and prefer freelancers with significant experience on the platform – which means a lot of money earned or many job-hours done.
  10. For an additional useful listen (in Hebrew), try https://omny.fm/shows/odpodcast/shahar-erez-15-growth which I listened to recently – a lot of useful advice there.

I hope this is useful for you, please share with me in the comments your own techniques for hiring freelancers!

Posted in Projects

Validating Flight Networks for Drones – part 1

A Flytrex drone lowering a package.

At Flytrex, in order to ensure drones fly only where allowed, and do not collide with each other, they are allowed to fly only in preplanned paths. However, if our goal is to deliver food to backyards, we can’t have a predefined path from our delivery center to each backyard – that would be wasteful. Instead, we created a flight network, which is a graph similar to a railway map, only instead of trains we have drones flying on it. This way, when a drone is required to fly to a certain delivery point, a path is calculated on the flight network graph.

(Interesting note: while we could theoretically support a 3D flight network with routes above each other, we decided for simplicity’s sake to only allow planar flight networks for now.)

In order to support multiple drones flying at the same time, we also support semaphores and locking, but I’ll cover that subject at another time.

It is critical to verify that a flight network is correct and will not cause problems. We need to make sure that no two edges in the graph cross, otherwise the two drones flying through them might collide. Additionally, no two waypoints (nodes in the graph) may be too close to each other – for the same reason.

How do we do that?

First problem – for every two nodes n1 and n2, we need to make sure that distance(n1, n2) >= MIN_DISTANCE.

Second problem – for every two edges e1 and e2 that don’t share a node, we need to make sure that distance(e1, e2) >= MIN_DISTANCE. Of course, if they intersect then the distance is 0.

The simplest way to solve this is using brute force – go over every possible pair of nodes, and every possible pair of edges. This way however, is too slow – it’s the classic quadratic complexity. Can we do better?
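For reference, the brute-force check for nodes is only a few lines. This is a sketch: the `MIN_DISTANCE` value, the function names, and the planar `euclidean` helper are placeholders, and any distance function can be passed in.

```python
import math
from itertools import combinations

MIN_DISTANCE = 10.0  # hypothetical value, in the same units as `distance`


def euclidean(a, b):
    """Planar distance between two (x, y) tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def find_too_close_pairs(nodes, distance, min_distance=MIN_DISTANCE):
    """Check every pair of nodes: O(n^2) calls to `distance`."""
    return [(a, b) for a, b in combinations(nodes, 2)
            if distance(a, b) < min_distance]
```

It is still handy as a reference implementation for testing a faster algorithm against.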

For nodes, it is relatively easy: find the closest pair of nodes, n1 and n2. If distance(n1, n2) < MIN_DISTANCE, return an error; otherwise, the flight network is ok. How quickly can we implement closest-pair() for nodes? Apparently in O(n log n), see this Wikipedia article and implementation guide.

We still have a problem though – both implementations assume Euclidean distance – and we need this implemented using e.g. Geodesic distance.

Here, this can be solved using one of two approaches:

  1. Project our nodes n1, n2 onto a Euclidean surface such that P(n1)=p1 and P(n2)=p2, and Geodesic-distance(n1, n2) ~= Euclidean-distance(p1, p2).
  2. Implement our closest-pair algorithm in a way that will work well enough over the earth’s surface.

(Note: happily enough, we can assume a maximum radius of a few kilometers for our flight network, otherwise this would have been much more complicated)

I tried first to go with option (1). However, how do you implement this projection correctly? I thought – let’s do a naive projection myself, using Wikipedia as a guide. Unfortunately again, this did not pan out. I took a sample of a few points, and calculated the Euclidean and Geodesic distances between them and compared. I got errors of 0%-15%.

15% error is way way too high.
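The comparison itself is easy to script. The sketch below is not the projection I used. It assumes an equirectangular projection around a reference point, and uses the haversine formula on a spherical earth as a stand-in for the geodesic distance. All names are illustrative.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean earth radius; a spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def project_m(lat, lon, ref_lat, ref_lon):
    """Equirectangular projection to local x, y meters around a reference."""
    x = (math.radians(lon - ref_lon) * EARTH_RADIUS_M
         * math.cos(math.radians(ref_lat)))
    y = math.radians(lat - ref_lat) * EARTH_RADIUS_M
    return x, y


def relative_error(p, q, ref):
    """Relative gap between projected Euclidean and haversine distance."""
    (x1, y1), (x2, y2) = project_m(*p, *ref), project_m(*q, *ref)
    euclidean = math.hypot(x2 - x1, y2 - y1)
    geodesic = haversine_m(*p, *q)
    return abs(euclidean - geodesic) / geodesic
```

Running `relative_error` over a sample of point pairs from a real network shows quickly whether a given projection is usable.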

Well, I don’t know, let’s get some help. Unfortunately, I wasn’t able to get this solution working using pyproj, and after some time spent I gave up on that direction. I decided to go back and try to reimplement closest-pair in a way that would work for Geodesic distances.

That worked!

To achieve that, I needed to replace all distance calculations with geodesic distance, and be able to go from a distance to a latitude delta. Well, using the same calculation from before, that is not too complicated. Also, we can have this support points with altitude without much effort.
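Going from a distance to a coordinate delta can be sketched as follows, assuming a spherical earth. The constant and function names are mine, and over a few kilometers this approximation should be good enough.

```python
import math

EARTH_RADIUS_M = 6_371_000  # spherical approximation (an assumption)


def distance_to_latitude_delta(distance_m):
    """Degrees of latitude spanned by `distance_m` meters along a meridian."""
    return math.degrees(distance_m / EARTH_RADIUS_M)


def distance_to_longitude_delta(distance_m, latitude_deg):
    """Degrees of longitude spanned by `distance_m` meters at a latitude."""
    return math.degrees(
        distance_m / (EARTH_RADIUS_M * math.cos(math.radians(latitude_deg))))
```

The longitude version needs the latitude because meridians get closer together away from the equator.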

Let’s write the function definition for closest-pair in a generic typed way:

from typing import Callable, List, Tuple, TypeVar

P = TypeVar("P")  # the point type


def find_closest_pair(
        points: List[P],
        x_getter: Callable[[P], float],
        y_getter: Callable[[P], float],
        distance: Callable[[P, P], float],
        distance_to_x_delta: Callable[[float, P], float]) -> Tuple[P, P, float]:
    """Find the two points p1, p2 in points with the shortest distance between them

    Parameters:
        points: a list of points from which we would like to find the closest pair
        x_getter: a function that returns the x coordinate of a point
        y_getter: a function that returns the y coordinate of a point
        distance: a function that returns the distance between two points
        distance_to_x_delta: a function that given a distance and a point,
            returns the difference in the x coordinate to get a
            new point that is the given distance away

    Returns:
        The pair of closest points and the distance between them
    """

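To show how these callables fit together, here is a deliberately simpler variant than the O(n log n) divide-and-conquer algorithm: sort by x, and stop scanning as soon as the x gap alone rules out a closer pair. It is a sketch, not the implementation used at Flytrex. The name `closest_pair_sorted_scan` is made up, `y_getter` is not needed here, and the worst case is still quadratic (e.g. when all points share the same x).

```python
import math
from typing import Callable, List, Tuple, TypeVar

P = TypeVar("P")


def closest_pair_sorted_scan(
        points: List[P],
        x_getter: Callable[[P], float],
        distance: Callable[[P, P], float],
        distance_to_x_delta: Callable[[float, P], float]) -> Tuple[P, P, float]:
    """Closest pair by sorting on x and pruning pairs too far apart in x."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    ordered = sorted(points, key=x_getter)
    best = (ordered[0], ordered[1], distance(ordered[0], ordered[1]))
    for i, p in enumerate(ordered):
        # Any point further than this in x cannot beat the current best.
        max_dx = distance_to_x_delta(best[2], p)
        for j in range(i + 1, len(ordered)):
            q = ordered[j]
            if x_getter(q) - x_getter(p) > max_dx:
                break
            d = distance(p, q)
            if d < best[2]:
                best = (p, q, d)
                max_dx = distance_to_x_delta(d, p)
    return best


def euclidean(a, b):
    """Planar distance, used here only to demonstrate the interface."""
    return math.hypot(a[0] - b[0], a[1] - b[1])
```

For planar points, `distance_to_x_delta` is just `lambda d, p: d`. For latitude/longitude points it would convert meters to degrees at the latitude of `p`.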
In Part 2 I will write about validating the distance between all edges.

Posted in Algorithms

LearnLang – a small chrome extension for learning the German cases

I’ve been learning German for quite some time now. Some months ago, it came to the point where I was stuck – in order to progress I had to learn the German cases by heart.

The German Cases – By Touhidur Rahman – Own work, CC BY-SA 4.0, Link

It’s not a lot of data, and being able to understand it is relatively straightforward, however knowing it actively as part of a language takes practice.

My main sources of German practice are Duolingo, books and music. Both books and music contribute to passive knowledge rather than practice, and Duolingo just wasn’t focused enough. I decided to write something myself. It was a small itch I had to scratch!

Ideally, I just wanted exercises that given a sentence, I would have to pick the correct form of der/das/die/den/dem/des whenever it appeared. This should apply to ein/eine/eines/einer/einem/einen, dein/deine/… and mein/meine/… etc. you get the point.

To achieve that, I wrote a small chrome extension that would process a page, find all the pieces of texts to replace, and add a bit of dropdown html instead of them. Then you would pick the right option in the dropdown – it would turn into the right word with a green checkmark, otherwise you would get some toaster message saying you were wrong.

Since these days I have a full time job plus two kids – I wrote this mostly during train rides and a couple of evenings. Doing this allowed me to learn how to write a chrome extension (it’s really easy), but interestingly enough, there is a small challenge there I didn’t expect: how to regex-search through text nodes in a given HTML document and to replace the match with some HTML? The solution is apparently non-trivial.

If you decide to take the old text, add some tags and then old_tag.innerHTML = modify(text_data) you are in for a nasty surprise. If that text_data contained html tags as text – they would now be parsed as HTML. This is at best a bug, and at worst a security risk. It would appear to work, except when it won’t. Unfortunately, a lot of answers on stackoverflow suggest you do exactly that.

Well, as a lazy developer – I used somebody else’s answer, almost as is. It wasn’t even the selected answer – the selected answer used innerHTML :(
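The principle behind the safe approach is to keep text as text: only the matched words become markup, and everything around them is escaped. The extension itself is JavaScript working on DOM text nodes, so the Python below is only an illustration of the idea, with a made-up function name and a made-up placeholder tag.

```python
import html
import re

ARTICLE_RE = re.compile(r"\b(der|die|das|den|dem|des)\b", re.IGNORECASE)


def to_exercise_html(text):
    """Turn plain text into HTML, replacing each article with a placeholder."""
    pieces = []
    last_end = 0
    for match in ARTICLE_RE.finditer(text):
        # Text between matches is escaped, so "<b>" stays literal text.
        pieces.append(html.escape(text[last_end:match.start()]))
        pieces.append('<select data-answer="%s"></select>'
                      % html.escape(match.group(0), quote=True))
        last_end = match.end()
    pieces.append(html.escape(text[last_end:]))
    return "".join(pieces)
```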

Here is the extension itself, you are welcome to try it out, e.g. on Rotkäppchen (AKA “Little Red Riding Hood”).

A demo of the extension
Posted in Programming

Writing a pandemic simulation

Over the last weekend I felt like programming something fun and easy, so I thought, why not write yet another pandemic/epidemic simulation.

A quick demo of simpandemic

So between helping a crying child and preparing lunch, I created simpandemic. It’s small, simplistic, but easy to play with and change parameters. As a toy project, it’s far from perfect. I implemented infection based on distance rather than collision detection, like some other simulations do, and optimized it using a grid and not a tree structure (e.g. rtree). However, it works, it is playable and very much tweak-able.
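The grid trick is simple enough to sketch in a few lines. This is not the code from simpandemic, just the general idea under my own naming: bucket points into cells the size of the infection radius, then compare each point only against points in the same or adjacent cells.

```python
import math
from collections import defaultdict


def find_close_pairs(positions, radius):
    """Return index pairs (i, j), i < j, whose points are within `radius`."""
    # Bucket every point into a square cell of side `radius`.
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(positions):
        grid[(math.floor(x / radius), math.floor(y / radius))].append(idx)

    pairs = set()
    for (cx, cy), members in grid.items():
        # Two points within `radius` are always in the same or adjacent cells.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy), ()):
                    for i in members:
                        if i >= j:
                            continue
                        (x1, y1), (x2, y2) = positions[i], positions[j]
                        if math.hypot(x2 - x1, y2 - y1) <= radius:
                            pairs.add((i, j))
    return pairs
```

When points are spread out, each one is compared against only a handful of others instead of the whole population.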

Right now it depends on pygame, which is great fun, but a bit of a pain to get it working on mac these days.

Feel free to download it, fork it, play with it, whatever. I’ll accept fun pull requests in case these actually come.

Stay healthy, stay safe, stay home!

Posted in Programming

How I learned to stop worrying and actually use StackOverflow

So apparently almost all of the developers in the world are using StackOverflow. However, many developers just use StackOverflow to look up answers, and rarely to ask their own questions. Answering other people’s questions is of course rarer still.

Up until recently I was the same: I wrote a few questions in StackOverflow, and even answered a few, but by and large I was using it to find existing answers.

This week something changed, something broke. In a way, I stopped caring. I had a problem, I didn’t find a solution fast enough, and decided, “what the heck, the solution is not obvious, I’ll just write a question”. Also, if the solution is obvious to someone else – that’s even better, I’ll learn something.

And so I asked my most recent questions, about distances between 2D segments, projections, etc. I’ll cover this subject in depth in a future blog post, as this one is about StackOverflow.

Writing a question on StackOverflow has a few advantages over not writing it. The most obvious one: you might actually get an answer! Here is a good example, my most recent question. The less obvious one is that you get to put your question down in writing, which, just like rubber duck debugging, helps you solve the problem, and lets you practice the skill of asking the right questions.

Also important to mention – you have nothing to lose but a little bit of time. As long as your question is real and you are not clueless, asking a question will not reflect badly on you in any way, quite the opposite.

What actually surprised me is the gamification of StackOverflow – you get points for participating. I already knew about it, but I was surprised at how effective it is. Here is where I am at the time of writing this post:

My StackOverflow reputation as of 2020-03-12

Participating on SO is surprisingly addictive, and as a close friend told me there are additional advantages: once your reputation is high enough – you start getting job offers, and you can actually use that on your resume/CV (if using them is a thing you do :)

My advice to any developer reading this: you are already looking up answers on StackOverflow. If you don’t find an answer, don’t just move on. Before you do – write a question. Even if you do move on, you’ll get something valuable from it.

Posted in Programming

Back to writing

So apparently my last blog post was from 2012. That’s quite a bit of time.

flytrex drone

Since then I’ve:

  • Had a son
  • Sold my startup Desti to HERE
  • Moved with my family to Boston
  • Moved back to Israel, joined Cymmetria, first as VP R&D and later as CTO
  • Had another son
  • Left Cymmetria and joined Flytrex as VP R&D

It’s not a long list, but it covers a lot of ground. Right now, Corona virus notwithstanding, I’m pretty excited about the work we do at Flytrex: we’re building a system for food delivery via autonomous drones.

Here is a short video that shows what we’re working on:

The video is by now 11 months old and the system has changed a lot since then. Our main challenge right now is getting this system working in the USA.

Learning from my experience, I want to start writing regularly. To achieve that, while I will write mostly about programming, I will also write about other areas of interest. Let’s see where this new adventure takes us. Onwards!

Posted in Personal

Two bugs don’t make a right

Three lefts roadsign
While working on my new startup, we are doing a little bit of reasoning using implications. One of the more curious forms of implications is the negative form: consider the following exaggerated example:

  • a place being kid-friendly implies that it is not romantic.
  • a place being a strip club implies it is not kid-friendly

If we allow negative implications to be transitive, then it would follow that since being a strip club makes a place less kid-friendly, it makes it more romantic. We don’t want that. So I had to write some code to specifically ignore that situation. Before writing that, in the best tradition of TDD I wrote a test for two chained negative implications. I implemented the code, the test passed and I was happy.

For a while.

Fast forward a couple of weeks, and I’m trying out adding some negative implications, and the program doesn’t behave as expected. My code doesn’t work. I turn back to my test, check it out, and sure enough, all the things the test asserts as True are actually True, and the test does test the right thing.

Digging deeper, I discovered the issue. I had two bugs: the first was that the code handling chained negative implications wasn’t working right. The second was in my graph building algorithm – it seems that I was forgetting to add some edges. What made that second bug insidious was that it hid the effect of the first bug from the test – effectively making the test pass.

So – for me it was – two negative implications don’t mean a positive one, and two bugs don’t make a feature.
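To make the rule about chained negative implications concrete, here is a toy sketch. It is not the code from the startup: the function name and data layout are invented, and it assumes positive implications are transitive while a negative implication always ends the chain.

```python
def derive_implications(positive, negative, start):
    """Return (implied, excluded) traits for `start`.

    positive: dict mapping a trait to the set of traits it implies
    negative: dict mapping a trait to the set of traits it rules out
    """
    # Positive implications are transitive: follow them to a fixed point.
    implied = {start}
    stack = [start]
    while stack:
        trait = stack.pop()
        for nxt in positive.get(trait, ()):
            if nxt not in implied:
                implied.add(nxt)
                stack.append(nxt)
    # A negative implication ends the chain: "not X" says nothing about
    # what X itself implies or rules out, so we never follow it further.
    excluded = set()
    for trait in implied:
        excluded |= negative.get(trait, set())
    implied.discard(start)
    return implied, excluded
```

With the example above, a strip club comes out as not kid-friendly, and nothing at all is concluded about it being romantic.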

Posted in Programming