NavBar

Thursday, 30 October 2014

Graph Implementation with C (For Programmer Graphs are different)

Graph Ready:

For a layman graphs are visual representation of representing relationship between the related quantities.We can summarise graph as it represent how one quantity changes effect the other quantity.
In mathematics we learn to draw graph in two dimensional space X-axis and Y-axis. 
In following example we can see graph represent the number of student register for the summer camp  in various streams. X-axis represent the streams and y-axis represent the number of sign ups.

For example in graph we can easily identify for drama 3 student has been register. So graph represent the data visually and it can be understand quickly.


In Programming World graphs are used to represent the relationship between the similar objects. A graph is made of vertices(objects) and edges or lines connected to objects represent the relationship between the object. So graph is an ordered pair G = (V, E) comprising a set of vertices with a set of E edges.

In above example graphs are ordered pair of G=(Cities, Distance).It represent the distance (relationship) between the cities (Objects). Above graph is an example of undirected graph. Let learn various graph terms before the detail of implementation. 

Graph Terms:

1) Vertex : Objects are represent by vertex in a graph. Vertex is represented by circle by label. In above graph all the cities are vertex.

2) Edge: Relationship or connection between the objects are represented by edges. In above example lines with KM are Edges. Edges can be directed or undirected. Above graph has all the undirected edges.
  
3) Degree of Vertex: Number of edges touches to the vertex are degree of vertex. For example Delhi has 2 degree in above graph.

4) Undirected Graph: If all the edges in a graph is bidirectional than graph is undirected graph. Above graph is Undirected graph.

5) Directed Graph: A graph in which edges are one direction. So edges graph can be use to represent the data flow. If we represent the web links in web site  as a graph then it will be directed graph.

6) Connected Graph: A graph is connected graph if there is path between any two pints in graph.

7) Disconnected Graph: A graph is disconnected if there has certain nodes, which are different from the existing graph.

Graph Steady:

This section explain about how we can implement graph using various programming language. We will discuss about the two commonly used representation of graph:

1) Adjacency Matrix (2-D Array)
2) Adjacency List (Array of Link List)

Let dig into detail of this implementation. Suppose you just reached office and you met your colleagues. So Following is the graph which represent the relationship whether you met the particular person on that day or not. Obviously everybody must see his/her manager every day :( . Not a sad thing some manager are really good ;).  Let come to point so following picture represent the graph which represent the relationship (seen/ Not Seen) to the objects(You and your colleagues.)

This graph is an ordered pair G (Person, Met/ Not).


We can come to following conclusion from the graph:
1) You met with Manager and Team Member 1
2) Manager met to everyone.
3) Team Member 1 met to you and manager.
4) Team Member 2 met to manager.

Adjacency Matrix (2-D Array):

In adjacency matrix we represent graph as 2-D array. A graph with n vertices will be represented by n*n matrix. If we need to check whether there is an edge exist between the vertices we can simply access it a[i][j].
Where a represent the base address of 2-D array and i and j represent the vertex 1 and vertex 2.
It can be represent in mathematical form:
   graphij   =  {1 if Vi -> Vj

                        0 otherwise}


For weighted graph in place of one we have Wij.

Following is the pictorial representation of above graph as 2-D array:

We can access the relationship between the objects in following manner:

1) You and manager met ?
graph[0][1]

2) Team Member(1) and Team member(2) met ?
graph[2][3]

So look up is very easy in the above representation. We have some cons for this approach:

Pros:

1) Look up is very fast in this approach. it takes constant running time O(1). As we seen in above query we can easily check whether a edge exist or not.

2) Insertion, Delation and Updating is also constant time process O(1). We directly need to update the value in array location by accessing index.

Cons:

1) Space is problem in this approach. It always take O(N*N) memory. Let suppose you have a graph with 10 vertices and 5 edges. You need 10*10*4(sizeof int) space only for storing 5*4(SizeofInt) bytes data. This approach is not suggested for sparse graph.

Adjacency List (Array of Link List):

In this representation an array of link list is used and size of array is equal to the vertices of the graph.
Let suppose you have declared an array graph[verticesCount]. Each element in the array will point to head of the link list. For example if you need to check for "manager"
graph["manager"] or graph[1] will point to link list and link list has all the vertex to which the manager is connected.
So in our case following will be link list:

You -> Team Member(1) -> Team Member(2).

Following is the visual representation of adjacency list of the above graph:


Adjacency List[] si an array which contain reference for the link list related to all the vertices. So in this representation we are stop ring based on the edges in graph. It takes less space and always recommended for sparse graph. Let first discuss pros and cons of the approach then we will dig into implementation of graph.


Pros:

1) It takes less space to represent the graph in comparison of adjacency matrix. Space complexity of with this approach is O(|V| + |E|) .

2) Insertion an edge is also take constant time O(1) in link list. Simply add after head.

Cons:

1) Look Up of edges in this approach is more complex and takes O(degree of vertex) time. We need to look for the each element in the link list.

2) Deletion and updating is also takes O(degree of vertex) time. We need to look for the element then it take constant time.

Graph Go:

In this section we will discuss the implementation of graph using c language. We will use a single code base or method to implement the both the representation.

Let go one by one all the Variables used in graph:

Enum: 

/*
 *It conatains constants for the storage type used for graph (2-D Array or link list)
 */
typedef enum {
   graph_storage_adjacency_matrix,
   graph_storage_adjacency_list
}graph_storage_type_e;

/*
 *It conatains constants various graph type Directed or Undirected
 */
typedef enum {
    graph_type_Directed,
    graph_type_Undirected
}graph_type;

So we have two enums in the graph API first use for define the constants for storage type and second for define the graph type.

Struct (Public):

/*
 *It represent the graph and the newly created garph will be refrence of this struct.
 */
struct graph_struct{
    graph_vertex_count    vertexCount;
    graph_internal        *graph_data;
};
typedef struct graph_struct SNGraph;

/*
 *It represent the vertex of graph and the newly created vertex will be refrence of this struct.
 */
struct graph_vertex_struct{
    char*      vertexName;

};
typedef struct graph_vertex_struct SNGraphVertex;

/*
 *It represent the edge of graph and the newly created edge will be refrence of this struct.
 */
struct graph_vertex_edge{
    int      weight;
};
typedef struct graph_vertex_edge   SNGraphEdge;

/*
 *It contains the graph properties and to create a new graph this property need to specify.
 */
typedef struct graph_properties_struct{
    graph_type               graphType;
    graph_storage_type_e     storageType;
}graph_properties;

Struct (Private):

/*
 *It contains the graph internal member.
 */
struct graph_internal_struct{
    graph_storage_type_e  storageType; //Storage type for graph
    graph_type            type;         //Type  of graph
    graph_vertex_count    currentCount; //Current Count of vertices added in graph
    void                  **vertices;   //Array contains  all the vertices of graph
    void                  **graph;      //refrence for all the edges of graph and structure depend          
                                         upon the storage type
};
typedef struct graph_internal_struct graph_internal;

/*
 *It contains the adjacency matrix member.
 */
typedef struct adjacency_matrix{
    int     **graph_matrix; //2-D array for adjacency Matrix
}graph_matrix;

/*
 *This structure represent the link list for adjacency list.
 */
struct graph_adjacency_list_struct{
    SNGraphVertex       *toVertex;   //refrence for the vertex
    void*               data;        //any Data related to relationship like for weighted graph 
                                     weight
    adjacency_list      *next_vertex; //Pointer to next vertex
};
typedef struct graph_adjacency_list_struct adjacency_list;

/*
 *This structure represent the head of adjacency list.
 */
struct graph_adjacency_head_struct{
    int                 degreeOfvertex; // It will have degree of vertex
    adjacency_list      *headVertex;    // It contain refrence for head for all the vertices
};
typedef struct graph_adjacency_head_struct adjacency_head;

Private and public here are only visibility shown. The private struct added in Graph.C file and public in Graph.h

Function (Public):

/*
 * Function: newGraph
 * ----------------------------
 *   allocate a new graph and return the refernce for it
 *
 *   properties: contain the properties for the graph
 *   count: the number of vertices in graph
 *
 *   returns: the refernce for the graph
 */
SNGraph* newGraph(graph_properties *properties, graph_vertex_count count);

/*
 * Function: addVertex
 * ----------------------------
 *   add a vertex in a graph
 *
 *   graph: the refrence for the graph in which vertex has to be add
 *   name: the name of vertex
 *
 *   returns: the refernce of the added vertex
 */
SNGraphVertex* addVertex(SNGraph *graph, const char* name);

/*
 * Function: addEdge
 * ----------------------------
 *   add a edge in a graph between the vertex
 *
 *   graph: the refrence for the graph in which edge has to be add
 *   vertex1: the start vertex of edge
 *   vertex2: the end vertex of edge
 *   data   : the data related to the edge (Specefic to weighted graph)
 *
 *   returns: the refernce of the added edge
 */
SNGraphEdge* addEdge(SNGraph *graph, SNGraphVertex *vertex1, SNGraphVertex *vertex2, void* data);

/*
 * Function: displayGraph
 * ----------------------------
 *   prints all the vertices with its connected neighbour
 *
 *   graph: the refrence for the graph in which vertex has to be add
 *
 *
 *   returns:
 */
void displayGraph(SNGraph *graph);

/*
 * Function: releaseGraph
 * ----------------------------
 *   release the memory used by graph.
 *
 *   graph: the pointer to refrence for the graph in which vertex has to be add
 *
 *
 *   returns:
 */

void releaseGraph(SNGraph **graphPtr);

Let us discuss the implementation in detail:

I have kept all the vertices of graph in SNGraph->graph_internal->vertices. 
It is pointer to a void pointer. It is a single dimension array in which element is pointer to a vertex of graph.


For storage of edges information or connection between the vertex i have used the pointer to a void pointer that is :

SNGraph->graph_internal->graph

It store all the edge information but implementation is different based on the storage type of graph.
Following is the  implementation:

Adjacency Matrix Implementation:

                    //Count here is number of vertex
            int** graph_matrix = calloc(sizeof(int*), count);
            for (int i = 0; i < count; i++) {
                graph_matrix[i] = calloc(sizeof(int*), count);
            }

            newGraph->graph_data->graph = (void**)graph_matrix;

Let suppose you want to check all the edges connected by Delhi. First you need to check the index of Delhi in vertices array. In our case it is zero. Then Following array will contain all the edges information for the Delhi:
 newGraph->graph_data->graph[0]

You can loop through the degree of Delhi vertex and check for all the edges.
Following is pictorial representation of that:
Adjacency List Implementation:

In case of adjacency list the same graph reference store the heads for the link list of all the vertex.

                         //Count here is number of vertex
                newGraph->graph_data->graph  = malloc(sizeof(adjacency_head*) * count);
                for (int i = 0; i < count; i++) {
                    newGraph->graph_data->graph[i] = malloc(sizeof(adjacency_head));
                    adjacency_head* graphHead = (adjacency_head*)newGraph->graph_data->graph[i];
                    graphHead[i].degreeOfvertex = 0;
                    graphHead[i].headVertex = NULL;
                }

Same case if you want to check for the connected vertex for Kanpur. First check the index in vertices. In our case it 3. So graph[3] will contain head for the link list of Kanpur.

Following is pictorial representation of that:
I have shown for first vertex. Similar link list for all the other vertices are there.
To download the full source code please click the link.
You can mail me for any issue or concern :).


Wednesday, 22 October 2014

Longest Increasing Sub-sequence(Dynamic Programming)

Problem Statement:

Suppose you have a given a integer sequence a[1..n]. You have to find the longest increasing sub sequence i.e. the sub sequence should be in increasing order with largest length possible in the given sequence.

Let suppose you have following sub sequence:


Longest increasing sub sequence is : 0, 2, 3, 4, 9, 12, 15.

is there any other longest or same length sub sequence is there?
Yes : 0, 2, 3, 4, 8, 12, 15
So both have the same length 7. Make sure no other sub sequence should have length greater than 7.

Just try ? If you able to find please add comment in the post we will discuss that.

So our main problem is how to solve this problem programatically? We will again discuss the various approach and then the one with Dynamic programming.

Solution Approach:

Brute Force Approach:
In brute force approach we will calculate all the possible sub sequence and then check for the longest sub sequence. So for calculating all the sub sequence is tedious task and take O(2n) complexity.
If three elements are there like
0
2
7
0, 2
0, 7
2, 7
0, 2, 7

So we have all the possible combination and longest is 0, 2, 7. So it is not a optimum solution.
We will not go with the solution.

Dynamic Approach:

We know that to solve any problem with dynamic approach it must have two attributes:

Optimal Substructure:
The solution is known has optimal sub structure if optimal solution can be construct from the optimal solutions of sub problem.

Can we do that in this situation ?
In above scenario what if i know that the longest sub sequence for all the indexes upto 5. Can i use that for calculating the sub sequence for  index  6 ?

Yes the sub sequence of index 6 can be

longSubsequence (6) = {
                                           max(longSubsequnce(j) + 1) where j:1...5 and  if a[6] > a[j],
                                           1        if no j found such that all a[j] is greater than a[6].
                                     }
So longest sub sequence of 6 can be calculated if we know the longest sub sequence of 5. So solution has optimum Sub structure.

Overlapping Subproblem(Recursion):
For Solving the maxSubsequence of 6 we need to solve the longestsubsequence of 5, 4, 3, 2 and1. So the problem has recursion possibly.

Solution:
We will start with the longest sub sequence for index 0 and go upto array length.
ls[0] = 1 //Only one element can be in sub sequence for length 1 array.

ls[1] = 2 // (0, 2) as a[1] is greater than a[0] So we can add it.

ls[2] = 1+ max(ls[0],ls[1]) as 7 is greater than both 0 and 2.
ls[2] = 1+max(1, 2)
         = 3

ls[3] = 1+max(ls[0], ls[1]) as a[3] < a[2] So we cannot consider that in the formula.
ls[3] = 1+max(1, 2)
         = 3

ls[4] = 1+max(ls[0], ls[1], ls[3]) as a[4] is less than a[2] so we cannot consider this
ls[4] = 1+max(1, 2, 3)
        = 4

ls[5] = 1+max(ls[0], ls[1], ls[2], ls[3], ls[4]) as 9 is greater than all the previous element.
        = 1+ max(1, 2, 3, 3, 4)
        = 5

ls[6] =  1+max(ls[0], ls[1], ls[2], ls[3], ls[4]) as 8 is less than 9 the previous element.
        =  1+ max(1, 2, 3, 3, 4)
        = 5

ls[7] = 1+max(ls[0], ls[1], ls[2], ls[3], ls[4], ls[5], ls[6]) as 12 is greater than all the previous element.
        = 1+ max(1, 2, 3, 3, 4, 5, 5)
       = 6 (0, 2, 3, 4, 9, 12) or (0, 2, 3, 4, 8, 12)
ls[8] = 1 as a[8] is less than all the previous element.

ls[9] = 1+max(ls[0], ls[1], ls[2], ls[3], ls[4], ls[5], ls[6], ls[7], ls[8]) as 15 is greater than all the previous element.
         = 1+ max(1, 2, 3, 3, 4, 5, 5, 6, 1)
         = 7 (0, 2, 3, 4, 9, 12, 15) or (0, 2, 3, 4, 8, 12, 15)

So we have solution with the above approach. Can we phrase the solution in one formula:

ls[i] = 1+max(ls[j]) where j:0...i if a[i] >  a[j]
           or
           1 if no such j found.

So I have implemented the solution using C. Following is the solution:

I am using a heap to store longest sub sequence for each index. Like for zero index the longest sub sequence only have one length and first element. So other we are calculating based on that above formula.

Following is heap structure:
//Use as a heap for store the longest sequence for index
typedef struct lsIncreasingSequence_struct{
    int longesubSequence[10];
    int longesubequenceLength;

}lsIncreasingSequence;

Calculating the longest sub sequence following is the code:
//return the longest sequence
lsIncreasingSequence longIncSubsequence(int sequence[], int seqL){
    lsIncreasingSequence lsubsequences[seqL+1]; //Store the longest sequence for each index
    int finalsubsequenceLength = 0;      //Store the longest sequence length
    int finalSubSequenceIndex = -1;       //Store the longest sequence index
    lsubsequences[0].longesubSequence[0] = sequence[0]; //First sequence have only first element
    lsubsequences[0].longesubequenceLength = 1//First sequence have only first element so count is 1
    
    //Loop Start from second element and count longest sequence for each index
    for (int i=1; i<seqL; i++) {
        //Loop check for all the possible sub sequence and check if length is largest
        for(int j=0; j <i ; j++){
            if(sequence[i] > sequence[j]){
                finalsubsequenceLength = lsubsequences[j].longesubequenceLength + 1;
                if(finalsubsequenceLength > lsubsequences[i].longesubequenceLength){
                    lsubsequences[i].longesubequenceLength = finalsubsequenceLength;
                    finalSubSequenceIndex  = j; //Store the largest increasing sub sequence for index i
                }
            }
        }
        
        //loop copy all the elements of longest sub sequence of i to index i of heap
        if(finalSubSequenceIndex != -1){
            int k = 0;
            for(; k < lsubsequences[i].longesubequenceLength-1; k++){
                lsubsequences[i].longesubSequence[k] = lsubsequences[finalSubSequenceIndex].longesubSequence[k];
            }
            lsubsequences[i].longesubSequence[k] = sequence[i];
        }
        finalSubSequenceIndex = -1;
    }
    //loop calculate longest increasing sub sequence from all the indexs in a heap
    finalsubsequenceLength = 0;
    for (int k=0; k<seqL; k++) {
        if(lsubsequences[k].longesubequenceLength > finalsubsequenceLength){
            finalsubsequenceLength = lsubsequences[k].longesubequenceLength;
            finalSubSequenceIndex = k;
        }
    }
    //return the longest sub sequence heap element
    return lsubsequences[finalSubSequenceIndex];

}


So the above method return the longest sub sequence.
You can download the running code from the link.
Please provide your suggestion.

Tuesday, 21 October 2014

Maximum Sub-Sequence Sum in Array (Dynamic Programming)

Problem Statement:

Suppose you have a given a integer array a[1..n]. You have to find the maximum sub sequence sum i.e. the sum of the numbers should be greater than any other sequence.

Suppose you have given an following array:


-2
-3
4
-1
-2
1
5
-3

You need to find the maximum subsequence?

We as normal human tendency always try to discard the negative number in sequence. So most of us will go with the solution 1,5 = 6 is sum of maximum sub sequence.

But that solution is wrong as the maximum sub sequence has sum 7.If you already chosen the best solution than it is great :).

Let try to search once again for the best solution without seeing below ;).

So Solution is 4, -1, -2 , 1,  5. Sum is 7.
So now we have to implement this solution by programming. We will discuss the various approach and then will try our solution with dynamic programming.


Solution Approach:

Brute Force Approach:
In brute force approach we will solve the problem in O(n) Complexity. In this approach we will iterate over each element and start doing sum until the length of array and after adding each element check sum if it is greater than max sum then replace the max sum with new value.

Note: This algorithm is implemented in this way that we always focus that sum cannot be less than zero.

Following is example

Max sum = 0;
sum = 0;

i = 0(-2)
sum = sum + (-2);
sum = -2.
So sum is less than zero so discard this sum and initialize the sum with zero.
i=1(-3)
sum = -3;
So sum is less than zero so discard this sum and initialize the sum with zero.
i=2(4)
sum = 4; 
So sum is greater than zero and max sum. So replace the max sum with sum and keep the sum.
i=3(-1)
sum = 3
So sum is greater than zero but not greater than max sum. So keep the sum but not replace with max sum.
i=4 (-2)
sum = 1
So sum is greater than zero but not greater than max sum. So keep the sum but not replace with max sum.
i=5(1)
sum = 2
So sum is greater than zero but not greater than max sum. So keep the sum but not replace with max sum.
i=6(5)
sum = 7
So sum is greater than zero and max sum(4). So replace the max sum with sum and keep the sum.
i=7(-3)
sum = 4
So sum is greater than zero but not greater than max sum. So keep the sum but not replace with max sum.
So in the end we have 7 in max sum this is our solution. We solve the problem in linear time.
The solution we provide here is according to well known Kadane's algorithm.

Dynamic Approach:

Dynamic programming solution is similar to Kadane's algorithm. We already know to solve any problem with dynamic programming it must have to attributes.

Optimal Substructure: We can solve the problem of length n by solving the lower problem. If we know the solution for i then for i+1 following will be the solution:
max(solution[i]+a[i+1],  a[i+1]);
solution[i] denotes maximum sub sequence sum for ith element.

But in this case storing solution[i] is an over head. We can simple solve the problem in linear time with time complexity O(N) and space complexity with O(1). But our topic is dynamic programming we are providing solution through dynamic approach.

Solution
Running Time : O(N)
Space Complexity: O(N)

Solution:

int calculateMaximumSubSequence(int array[], int size){
    int max[size]; // Store max sub sequence at each index. We do not need this as Kadan's algo but for dynamic approch we are using that
    memset(max, 0, size);
    int maximumSubsequence = 0; // Store max sub sequence for array
    max[0] = array[0];
    for(int i = 1; i < size;i++){
        // Check if previous max sib sequence is greater than zero. If it is negative then our condition solution[i]+a[i+1] is always less than a[i+1]
        if(max[i - 1] > 0){
            max[i] = max[i-1]+array[i];
        }else{
            max[i] = array[i];
        }
        // Store the max sub sequence for array
        if(maximumSubsequence < max[i]){
            maximumSubsequence = max[i];
        }
    }
    return maximumSubsequence;

}

You can download full source code from the link:
Please provide your input and suggestion on the tutorial.