Graph library optimizations
As a few algorithms I had initially implemented were not far from using brute force methods to do their job, I had to change them to be able to create the junction tree of graphs with a few dozen nodes with no delay for the user (so far, tested up to 30 nodes and 45 edges : now creates junction tree in 135 ms, whereas before optimization it took several minutes, due to a few combinatory explosions)
- complete rewrite of the maximal cliques set determination. This was an NP-hard problem and I hadn’t taken it seriously enough at first sight. Instead of shrinking possible cliques to all possible subcliques, I grow them from known small cliques, ensure only ordered cliques are grown and remove non maximal cliques at the end. On my 30 nodes graph, getting the cliques is 100 times faster now…
- partially rewrote the graph topology : when a graph is connex (typical case if it is not oriented), there is no need to compute its full connectivity matrix (which has ones everywhere, except maybe on its diagonal). Instead, I do a quick gluton connexity check and manually determine the terms on the connectivity matrix’s diagonal (that is, which nodes have paths to themselves). I use this information to be able to report correctly the existence of paths between two nodes when the graph is connex.
- completely rewrote the triangulation algorithm. It was actually adding edges one by one and recomputing the full connectivity matrix between each addition, because it was asking the minimal path between two nodes using the list of paths between these nodes. I only needed the length of the minimal path (to check minimality of a cycle) and managed to avoid completely the recomputation of the connectivity, as I rely on a list of cycles to triangulate. I keep a map from each node to the set of cycles to triangulate that contain that node. Each time I add a new edge, I try to split all the cycles to triangulate that contain a node of this edge. If, once split, they are triangular, they are removed from the list of cycles to triangulate. The speed improvement over the previous super-exponential naive approach is incredible…
- optimized the paths finding algorithm : moved tests around, added lots of arguments to refine the query : minimal size of paths wanted, maximal size, option to skip obviously non-minimal paths, cycle-related parameters (to skip unordered cycles)…
I should now be able to test much larger graphs, with maybe up to 100, 1000 nodes (I now get performance that look like in O(nEdges)… Next step after that is to implement propagation in the junction tree…