Wednesday, July 18, 2012

Fibonacci Numbers

Wrote a couple of versions of Fibonacci number generators. 

Interestingly, the numbers become inaccurate at just the 93rd fib number in the sequence: the real numbers begin losing precision on the low end.   I am going to have to figure out how to use one of the bignum libraries in order to work exactly with such large values.  My fear is that the bignum libraries have a substantial performance hit in return for the accurate values.

The recursive version becomes too slow to use at just the 40th number in the sequence, while the non-recursive version fires through thousands of values.


Fib-recursive.c
 
#include <stdio.h>

/* This program calculates fibonacci sequence using recursion

 1 1 2 3 5 8 13 21 34
 
 Starting with 1
 Each number in the sequence is created by adding the previous two numbers in sequence.
 
*/


long double fib(long double value){
  if (value <= 2)
    return 1;
    
  return (fib(value-1) + fib (value-2));
}

int
main(){

long double i;

  for (i = 1; i<1000; i++)
     printf ("value is %.0Lf and return value is %.0Lf\n", i, fib(i)); 

}
 
 
 
Fib-nonrecursive.c
 
#include <stdio.h>

/* This program calculates fibonacci sequence using recursion

 1 1 2 3 5 8 13 21 34
 
 Starting with 1
 Each number in the sequence is created by adding the previous two numbers in sequence.
 
 If you notice, something bad happens when we reach number 93 and we lose precision off the end.
 The number is close to the right number but is subtly wrong.
 
 We need to switch to a bignum library to add numbers that large precisely.
 
 Interestingly we can calculate the first thousand numbers in the sequence this way before we can calculate the first 30 numbers recursively.
 
*/


int
main(){

long double i, j, p1, p2;

  p1 = 0;
  p2 = 1;
  i  = 0;
  j = 1;
  
  while (j<1000){
    i = p1 + p2;
    printf ("\t%.0Lf\t%.0Lf\n", j, i);
    p2 = p1;
    p1 = i;
    j++;
  }

} 
 
 

Monday, July 9, 2012

Simple web server using epoll and sendfile.

I have been looking at how to use sendfile() and epoll() for a while now, so I finally sat down and wrote a very simple web server that uses these two sets of functions.  The resulting code follows this text.  This web server just performs a get request, and returns the file at the requested path.  If the file cannot be found, it returns a 401 error, file not found.

sendfile() is useful because it dramatically speeds up and simplifies reading a file from disk and sending it out the network file descriptor.   Typically a web server has to have the file and the network socket open, and then read a block from the file, then write the block to the disk, this results in the data being copies between several buffers and the buffer management involved in that.   However the sendfile() man page recommends to fall back to the normal read and write in cases where


epoll() is great because it doesn't have the overhead of select, or most of the limitations.  With select you have to loop through every element in a large, fixed size array, testing each one to see if you got a connection on that element.   With epoll you get a list of elements to handle that you can quickly process, limiting unproductive looping on the possibly thousands of elements you  did not have to process.  Epoll can also easily handle thousands of simultaneous connections without slowing down like select would.


This program does not perform cgi, send mime types, or perform any http request other than get.    It assumes that the first data block of the request contains the entire request.  The program assumes one request per connection.  It ignores any trailing text after the first line.  The beauty is in it's compiled size, it is just 9.5KB in size and is very easy to understand.  


This program only handles a single request at a time because the sendfile request blocks.

Next steps:
  1. Break code into generic epoll server file and server specific files, with call backs in the generic epoll side to the server side to handle new connect, data request, and closed connections.  This will allow the generic epoll server stuff to be completely generic and usable for various other projects.  If it is a library then this could be shared among multiple executing programs.
  2. Handle the request correctly to entirely process the request header and attach all the data to the connection, then perform the request by the request type.
  3. Handle mime types.  Each file should have a mimetype header sent before the data based on the extension of the file.
  4. Handle things other than just "GET".
  5. Handle multiple requests on a single connection.
  6. Add becoming a deamon and signal handling to code to properly handle various tasks.
  7. Use a worker thread pool to get requests and process them while the main process just handles new connections, reading client requests, and closing old connections.   The worker threads can each handle a single completed request, sending the data to the client through the port.
  8. Handle the reading of the input and processing of requests in the worker thread as well, this would leave the main thread just handling new connections passing triggered events to threads.
  9. Handle cgi programs.  I already have code to do this, just needs converted to also use epoll() to monitor the open handles and process events.
  10. Add configuration file reading.
  11. Add database reading/writing for persistent data across connections using sqlite.  Allow get and put requests using the database.



/*

This is a very simple web server that listens for connections on epoll.

It only processes the first line and only understands GET and the path.

Web server will flag the OS to send file to port.

Root is the Root directory just above the location of the executabe.

*/

#include <sys/sendfile.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/timeb.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/epoll.h>
#include <netinet/in.h>
#include <string.h>
#include <fcntl.h>
#include <signal.h>
#include <errno.h>
#include <stdlib.h>

#define MAX_CLIENT 5000
#define PORT 8080

void nonblock(int sockfd)
{
  int opts;
  opts = fcntl(sockfd, F_GETFL);
  if(opts < 0)
  {
    perror("fcntl(F_GETFL)\n");
    exit(1);
  }
  opts = (opts | O_NONBLOCK);
  if(fcntl(sockfd, F_SETFL, opts) < 0) 
  {
    perror("fcntl(F_SETFL)\n");
    exit(1);
  }
}


void
Initialize(){

}


/* Parse and return file handle for http get request */
int
OpenFile(char * buf, int n){

  int i, j;

  char * index = {"index.html"};

  char path[1024];

  if (buf==NULL || n<10)
    return 0;

  // Ensure that the string starts with "GET "
  if (!buf[0]=='G' || !buf[1]=='E' || !buf[2]=='T' || !buf[3]==' ')
    return 0;

  // copy the string
  for (i=0; i < n && buf[i+4]!=' '; i++)
    path[i]=buf[i+4];  

  // put null on end of path
  path[i]='\x00';

  // if the syntax doesn't have a space following a path then return
  if (buf[i+4]!=' ' || i == 0)
    return 0;

  // if the last char is a slash, append index.html 
  if (path[i-1] == '/')
    for (j=0; j<12; j++)
        path[i+j]=index[j];

  printf("Found path: \"%s\"\n", path);

  // the +1 removes the leading slash
  return (open(path+1, O_RDONLY));


}

void
Respond(int fd, char * buf, int n){

  char badresponse[1024] = {"HTTP/1.1 404 Not Found\r\n\r\n<HTML><HEAD><meta http-equiv=\"content-type\" content=\"text/html;charset=utf-8\">\r\n<TITLE>Not Found</TITLE></HEAD><BODY>\r\n<H1>Not Found</H1>\r\n</BODY></HTML>\r\n\r\n"};
  char goodresponse[1024] = {"HTTP/1.1 200 OK\r\n\r\n"};

  int fh; 

//  printf("%d data received: \n%s\n", fd, buf);
  
  fh = OpenFile(buf, n);

  if (fh > 0){

    struct stat stat_buf;  /* hold information about input file */

    send(fd, goodresponse, strlen(goodresponse), 0);

    /* size and permissions of fh */
    fstat(fh, &stat_buf);
    sendfile(fd, fh, NULL, stat_buf.st_size);
    close(fh);

  } else {

    send(fd, badresponse, strlen(badresponse), 0);
  }
  close(fd);
}


void
Respond_simple( int fd, char * buf, int n){

  char response[1024] = {"HTTP/1.1 200 OK\r\n\r\n<HTML><HEAD><meta http-equiv=\"content-type\" content=\"text/html;charset=utf-8\">\r\n<TITLE>301 Moved</TITLE></HEAD><BODY>\r\n<H1>Hello World!</H1>\r\n</BODY></HTML>\r\n\r\n"};
  int y = strlen(response);

  printf("%d data received: \n%s\n", fd, buf);
  send(fd, response, y, 0);
  close(fd);

}

void
Loop(){

  struct epoll_event *events;
  struct sockaddr_in srv;
  struct epoll_event ev;

  int listenfd;
  int epfd;
  int clifd;
  int i;
  int res;

  char buffer[1024];
  int n;

  events = (struct epoll_event *)calloc(MAX_CLIENT, sizeof(struct epoll_event));
  
  if( (listenfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
  {
    perror("error opening socket\n");
    exit(1);
  }

  bzero(&srv, sizeof(srv));
  srv.sin_family = AF_INET;
  srv.sin_addr.s_addr = INADDR_ANY;
  srv.sin_port = htons(PORT);

  if( bind(listenfd, (struct sockaddr *) &srv, sizeof(srv)) < 0)
  {
    perror("error binding to socket\n");
    exit(1);
  }
  
  listen(listenfd, 1024);

  int reuse_addr = 1;           /* Used so we can re-bind to our port */
  /* So that we can re-bind to it without TIME_WAIT problems */
  setsockopt (listenfd, SOL_SOCKET, SO_REUSEADDR, &reuse_addr,
              sizeof (reuse_addr));

  epfd = epoll_create(MAX_CLIENT);
  if(!epfd)
  {
    perror("epoll_create\n");
    exit(1);
  }
  ev.events = EPOLLIN | EPOLLERR | EPOLLHUP;
  ev.data.fd = listenfd;
  if(epoll_ctl(epfd, EPOLL_CTL_ADD, listenfd, &ev) < 0)
  {
    perror("epoll_ctl, failed to add listenfd\n");
    exit(1);
  }


  for( ; ; )
  {
    usleep(8000);
    res = epoll_wait(epfd, events, MAX_CLIENT, 0);
    for(i = 0; i < res; i++)
    {
      if(events[i].data.fd == listenfd)
      {
        clifd = accept(listenfd, NULL, NULL);
        if(clifd > 0)
        {
          printf(".");
          nonblock(clifd);
          ev.events = EPOLLIN | EPOLLET;
          ev.data.fd = clifd;
          if(epoll_ctl(epfd, EPOLL_CTL_ADD, clifd, &ev) < 0)
          {
            perror("epoll_ctl ADD\n");
            exit(1);
          }
        }
      }
      else {
        n = recv(events[i].data.fd, buffer, 1023, 0);
        if(n == 0)  /* closed connection */
        {
          epoll_ctl(epfd, EPOLL_CTL_DEL, events[i].data.fd, NULL);
        }
        else if(n < 0)  /* error, close connection */
        {
          //epoll_ctl(epfd, EPOLL_CTL_DEL, events[i].data.fd, NULL);
      close(events[i].data.fd);
        }
        else {  /* got a request, process it. */
          //send(events[i].data.fd, buffer, n, 0);
          //bzero(&buffer, n);
          //printf("%d data received: \n%s\n", events[i].data.fd, buffer);

          //send(events[i].data.fd, response, y, 0);
      //close(events[i].data.fd);

      Respond(events[i].data.fd, buffer, n);

        }
      }
    }
  }

}

void Shutdown(){

}

int main (){

  Initialize();
  Loop();
  Shutdown();

  exit(0);
}