Categories
Algorithms

Validating Flight Networks for Drones – part 1

A Flytrex drone lowering a package.

At Flytrex, in order to ensure drones fly only where allowed, and do not collide with each other, they are allowed to fly only in preplanned paths. However, if our goal is to deliver food to backyards, we can’t have a predefined path from our delivery center to each back yard, that would be wasteful. Instead, we created a flight network, which is a graph similar to a railway map, only instead of trains we have drones flying on it. This way, when a drone is required to a fly to a certain delivery point, a path is calculated on the flight network graph.

(Interesting note: while we could theoretically support a 3D flight network with routes above each other, we decided for simplicity’s sake to only allow planar flight networks for now.)

In order to support multiple drones flying at the same time, we also support semaphores and locking, but I’ll cover that subject at another time.

It is critical to verify that a flight network is correct and will not cause problems. We need to make sure that no two edges in the graph cross, otherwise the two drones flying through them might collide. Additionally, no two waypoints (nodes in the graph) may be too close to each other – for the same reason.

How do we do that?

First problem – for every two nodes n1 and n2, we need to make sure that distance(n1, n2) >= MIN_DISTANCE.

Second problem – for every two edges e1 and e2 that don’t share a node, we need to make sure that distance(e1, e2) >= MIN_DISTANCE. Of course, if they intersect then the distance is 0.

The simplest way to solve this is using brute force – go over every possible pair of nodes, and every possible pair of edges. This way however, is too slow – it’s the classic quadratic complexity. Can we do better?

For nodes, it is relatively easy: find the closest pair of nodes, n1 and n2. If distance(n1, n2) < MIN_DISTANCE return error, otherwise, the flight network is ok. How quickly can we implement closest-pair() for nodes? Apparently in O(nlogn), see this Wikipedia article and implementation guide.

We still have a problem though – both implementations assume Euclidean distance – and we need this implemented using e.g. Geodesic distance.

Here, this can be solved using one of two approaches:

  1. Project our nodes n1, n2 over an Euclidean surface such P(n1)=p1 and P(n2)=p2, and Geodesic-distance(n1, n2) ~= Euclidean-distance(p1, p2).
  2. Implement our closest-pair algorithm in a way that will work well enough over the earth’s surface.

(Note: happily enough, we can assume a maximum radius of a few kilometers for our flight network, otherwise this would have been much more complicated)

I tried first to go with option (1). However, how do you implement this projection correctly? I thought – let’s do a naive projection myself, using Wikipedia as a guide. Unfortunately again, this did not pan out. I took a sample of a few points, and calculated the Euclidean and Geodesic distances between them and compared. I got errors of 0%-15%.

15% error is way way too high.

Well, I don’t know, let’s get some help. Unfortunately, I wasn’t able to get this solution working using pyproj, and after some time spent I gave up on that direction. I decided to go back and try to reimplement closest-pair in a way that would work for Geodesic distances.

That worked!

To achieve that, I needed to replace all distance calculations with geodesic distance, and be able to go from a distance to latitude delta. Well, using the same calculation from before that is not too complicated. Also, we can have this support points with altitude without much effort/

Let’s write the function definition for closest-pair in a generic typed way:

def find_closest_pair(
        points: List[P],
        x_getter: Callable[[P], float],
        y_getter: Callable[[P], float],
        distance: Callable[[P, P], float],
        distance_to_x_delta: Callable[[float, P], float]) -> Tuple[P, P, float]:

    """Find the two points p1, p2 in points with the shortest distance between them
    Parameters:
        points: a list of points from which we would like to find the closest pair
        x_getter: a function that returns the x coordinate of a point
        y_getter: a function that returns the y coordinate of a point
        distance: a function that returns the distance between two points
        distance_to_x_delta: a function that given a point and a distance, 
            returns the difference in the x coordinate to get a 
            new point that is the given distance away

    Returns:
            The pair of closest points and the distance between them
    """

In Part 2 I will write about validating the distance between all edges.

Categories
Programming Python Utility Functions

Harvesting with threadmap

Dune 2 Harvester

From time to time, I need to harvest a website, or many websites. For example, to collect the data from IMDB to run the Pagerank algorithm. Other times I need to query some non-web servers.

Usually in such cases, I have a ‘read_single_url’ function that is called in a loop from a ‘read_all_urls’ function. The straightforward implementation of this will run slowly. This is not because read_single_url takes a lot of time to parse the websites it downloads. The delay is mostly due to the latency of network operations. Even on high bandwidth connections, your bandwidth utilization will be quite low.

To fix this, I wrote a function named threadmap that runs each call of read_single_url in a separate thread. Just like map, threadmap runs a given function for each element in the input sequence, and returns once all the calls are complete.

Here is an example use of the function:

threadmap.threadmap(query_server,
                    url_list,
                    max_threads=10,
                    on_exception=threadmap.IGNORE)

My first naive implementation just created a thread for each element in the list, and started them all simultaneously. This caused network IOErrors and other problems. This issue was handled by setting a maximum number of threads that may run at once.

The next issue I had to handle was exceptions. It is not obvious what is the best course of action once the inner function raises an exception. At the least, the exception has to be handled so that threadmap’s synchronizing code may be allowed to run.
My current implementation allows for a few different behaviors: ignoring the exception, aborting threadmap, retrying, and returning a default value for the problematic call. To implement these behaviors, I used the traceback module, after reading Ian Bickings’ excellent explanation of exception re-raising.

For those interested, here’s a copy of the code. I’ll be glad to read any comments or suggestions about it.