Star Catalog 1.0

This post covers the first part of a project to develop a star catalog in C. The aims of this initial stage are quite modest - we will write a program to import star data from a file into a data structure and then print it to the screen in a neat format.

Future plans for the project include calculating a few extra pieces of information from existing data, sorting and filtering the data, and exporting it to various other formats.

All these things are specific to stellar data but of course the principles and coding techniques can be applied to any field.

The Raw Data

The data for this project is a file containing details of the 300 brightest stars visible from Earth; the file is included in this post's download zip. To give you an idea of what it looks like, these are the first few lines.

StarCatalogue.csv (headings and first 6 rows)

Star|Constellation|RightAscension|Declination|Apparent Magnitude|B-V|Absolute Magnitude|Distance LY|Name
alpha|Andromeda|00 08.2|+29 04|2.1|-0.11|-0.4|100|Alpheratz
beta|Cassiopeia|00 09.0|+59 08|2.3|0.34|2|45|Caph
gamma|Pegasus|00 13.1|+15 10|2.8|-0.23|-3.1|490|Algenib
beta|Hydrus|00 25.4|-77 17|2.8|0.62|3.8|20|-
alpha|Phoenix|00 26.2|-42 20|2.39|1.09|0.7|62|Ankaa
delta|Andromeda|00 39.2|+30 50|3.27|1.28|-0.3|170|-

As you can see, the first line consists of column headings, and the data itself is pipe-separated (the | character) rather than comma-separated. The pipe is a safer separator than the comma because it rarely appears in actual data, so fields are far less likely to need quoting or escaping.

I will give a brief description of each column for anybody who is interested. If you know a bit about astronomy already just skip to the next section.

  • Star and Constellation - the brightest stars are typically identified by a Greek letter followed by their constellation (theoretically in descending order of brightness although this sometimes goes wrong)

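  • Right ascension and declination - these are the equivalents of longitude and latitude on Earth, although they are measured against fixed points on the celestial sphere rather than points on the moving surface of Earth. In this file right ascension is given in hours and minutes, and declination in degrees and arcminutes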

  • Apparent magnitude - the brightness of the star as seen from Earth; the brighter the star, the lower the number

  • B-V - the colour index, a measure of a star's colour and therefore its temperature; lower values indicate bluer, hotter stars

  • Absolute magnitude - the brightness of the star as it would be seen from a fixed distance of 32.6 light years (10 parsecs)

  • Distance in light years - the number of years light takes to travel from the star to Earth. The nearest star to us after the Sun is Proxima Centauri at about 4.25 light years.

  • Name - most of the brightest stars have a common name as well as a Greek letter + constellation name, many of Arabic origin
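
To make the right ascension and declination columns a little more concrete, here is a minimal sketch of how the text values could be turned into decimal degrees. It assumes right ascension is written as hours and decimal minutes (eg "00 08.2") and declination as a sign followed by degrees and arcminutes (eg "+29 04"), as in the sample lines above. The function names ra_to_degrees and dec_to_degrees are hypothetical and are not part of the project code.

```c
#include <stdbool.h>
#include <stdio.h>

// Hypothetical helper: convert right ascension text such as "00 08.2"
// (hours and decimal minutes) to degrees. One hour of RA is 15 degrees.
bool ra_to_degrees(const char* text, double* degrees)
{
    int hours;
    double minutes;

    if(sscanf(text, "%d %lf", &hours, &minutes) != 2)
    {
        return false;
    }

    *degrees = (hours + minutes / 60.0) * 15.0;

    return true;
}

// Hypothetical helper: convert declination text such as "-77 17"
// (sign, degrees, arcminutes) to decimal degrees. The sign is read as a
// separate character so that values like "-00 30" stay negative.
// Assumption: the sign is always present, as it is in the sample rows.
bool dec_to_degrees(const char* text, double* degrees)
{
    char sign;
    int whole_degrees;
    double arcminutes;

    if(sscanf(text, " %c%d %lf", &sign, &whole_degrees, &arcminutes) != 3)
    {
        return false;
    }

    if(sign != '+' && sign != '-')
    {
        return false;
    }

    *degrees = whole_degrees + arcminutes / 60.0;

    if(sign == '-')
    {
        *degrees = -*degrees;
    }

    return true;
}
```

For Alpheratz, the first star in the file, "00 08.2" works out at roughly 2.05 degrees of right ascension and "+29 04" at roughly 29.07 degrees of declination.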

Star Catalog 1.0 Specification and Design

The requirements for this phase are simply to read the file into a data structure and print it out. For this we will need a struct to hold the data on each individual star, a struct for a list of all stars, and several functions to carry out the required tasks. These functions are:

  • starcatalog_new - create a new empty star catalog
  • starcatalog_import_csv - read data from the specified file and use it to populate a star catalog
  • starcatalog_append - append individual stars to a catalog
  • starcatalog_print - print the data in a neatly formatted way
  • starcatalog_free - free up dynamically allocated memory

Coding

Create a new folder somewhere and in it create the following empty files. You can also download the source code from the Downloads page or clone/download from GitHub. You will need the download anyway to get the csv file, even if you decide to type the code by hand.

  • starcatalog.h
  • starcatalog.c
  • main.c

Open the header file and enter the following.

starcatalog.h

// include guard - stops the contents being processed twice
// if the header is included more than once
#ifndef STARCATALOG_H
#define STARCATALOG_H

#include<stdlib.h>
#include<stdbool.h>
#include<stdio.h>

//--------------------------------------------------------
// STRUCT star
//--------------------------------------------------------
typedef struct star
{
    char star[16];
    char constellation[32];
    char right_ascension[16];
    char declination[16];
    double apparent_magnitude;
    double B_V;
    double absolute_magnitude;
    double distance_light_years;
    char name[32];
} star;

//--------------------------------------------------------
// STRUCT starcatalog
//--------------------------------------------------------
typedef struct starcatalog
{
    star** stars;        // array of pointers to individually allocated stars
    unsigned long size;  // number of stars in the array
} starcatalog;

//--------------------------------------------------------
// FUNCTION PROTOTYPES
//--------------------------------------------------------
starcatalog* starcatalog_new(void);
bool starcatalog_import_csv(starcatalog* sc, char* filepath);
bool starcatalog_append(starcatalog* sc, star* s);
void starcatalog_print(starcatalog* sc);
void starcatalog_free(starcatalog* sc);

#endif

The first struct is star, and as the string fields are small I have hard coded array sizes for them rather than dynamically allocating the required amount of memory. We then have the starcatalog struct which simply holds an array of pointers to the stars themselves and the size, ie the number of stars. Finally there are the prototypes for the functions listed earlier.
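
One consequence of fixed-size string fields is that anything copied into them must be limited to the array size. As a rough illustration (the helper name copy_field is hypothetical and does not appear in the project), snprintf can be used to copy a string into a fixed buffer and report whether it had to be truncated.

```c
#include <stdbool.h>
#include <stdio.h>

// Hypothetical helper: copy text into a fixed-size field.
// snprintf never writes more than field_size bytes and always terminates
// the result (for field_size > 0). Its return value is the length the full
// string would have had, so a value >= field_size means it was truncated.
bool copy_field(char* field, size_t field_size, const char* text)
{
    int needed = snprintf(field, field_size, "%s", text);

    return needed >= 0 && (size_t)needed < field_size;
}
```

With an 8 byte buffer "Caph" fits and the function returns true, whereas "Alpheratz" is cut short to "Alphera" and the function returns false.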

Now we can move on to the starcatalog.c file.

starcatalog.c part 1: #includes and starcatalog_new

#include<stdlib.h>
#include<stdbool.h>
#include<stdio.h>
#include<string.h>

#include"starcatalog.h"

//--------------------------------------------------------
// FUNCTION starcatalog_new
//--------------------------------------------------------
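starcatalog* starcatalog_new(void)
{
    starcatalog* sc = malloc(sizeof(starcatalog));

    // malloc can fail, so only initialize if we actually got memory
    if(sc != NULL)
    {
        sc->stars = NULL;
        sc->size = 0;
    }

    return sc;
}

After the #includes comes starcatalog_new which simply allocates, initializes and returns a new star catalog, or returns NULL if the memory could not be allocated.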

starcatalog.c part 2: starcatalog_append

//--------------------------------------------------------
// FUNCTION starcatalog_append
//--------------------------------------------------------
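bool starcatalog_append(starcatalog* sc, star* s)
{
    // stars is an array of pointers so each element is sizeof(star*).
    // realloc behaves like malloc when passed NULL, so a single call
    // covers both the empty and non-empty cases.
    star** grown = realloc(sc->stars, sizeof(star*) * (sc->size + 1));

    if(grown != NULL)
    {
        sc->stars = grown;

        sc->stars[sc->size] = s;

        sc->size++;

        return true;
    }
    else
    {
        // the existing array is untouched and still owned by the catalog
        return false;
    }
}

The append function takes a starcatalog and a star, and appends the star to the catalog. Firstly it uses realloc to obtain or increase the memory for the array of star pointers; if the catalog is empty the stars pointer is NULL and realloc then behaves like malloc. The result goes into a temporary pointer so that the existing array is not lost if realloc fails. If the call is successful we add the star and increment the size.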
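
Growing the array by one element per append is perfectly adequate for 300 stars. For much larger files a common alternative is to grow the capacity geometrically so that realloc is called far less often. The following is only a sketch of that idea using a separate, hypothetical pointer_list struct with an extra capacity member; it is not how the starcatalog struct in this post is defined.

```c
#include <stdbool.h>
#include <stdlib.h>

// Hypothetical list type, for illustration only.
typedef struct pointer_list
{
    void** items;
    size_t size;      // number of items in use
    size_t capacity;  // number of items allocated
} pointer_list;

bool pointer_list_append(pointer_list* list, void* item)
{
    if(list->size == list->capacity)
    {
        // double the capacity, starting from a small initial size
        size_t new_capacity = (list->capacity == 0) ? 16 : list->capacity * 2;

        // keep the old pointer until we know realloc succeeded
        void** grown = realloc(list->items, new_capacity * sizeof(void*));

        if(grown == NULL)
        {
            return false;
        }

        list->items = grown;
        list->capacity = new_capacity;
    }

    list->items[list->size] = item;
    list->size++;

    return true;
}
```

Appending 100 items to an empty list this way calls realloc only four times (for capacities of 16, 32, 64 and 128) rather than 100 times.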

starcatalog.c part 3: starcatalog_import_csv

//--------------------------------------------------------
// FUNCTION starcatalog_import_csv
//--------------------------------------------------------
bool starcatalog_import_csv(starcatalog* sc, char* filepath)
{
    char line[1024];

    FILE* fp;

    fp = fopen(filepath, "r");

    star* ps = NULL;

    if(fp != NULL)
    {
        // skip first row - headings
        fgets(line, sizeof(line), fp);

        while(fgets(line, sizeof(line), fp) != NULL)
        {
            ps = malloc(sizeof(star));

            if(ps != NULL)
            {
                // string widths are one less than the array sizes
                // to leave room for the terminating '\0'
                int fields = sscanf(line,
                       "%15[^|]|%31[^|]|%15[^|]|%15[^|]|%lf|%lf|%lf|%lf|%31[^\r\n]",
                       ps->star,
                       ps->constellation,
                       ps->right_ascension,
                       ps->declination,
                       &ps->apparent_magnitude,
                       &ps->B_V,
                       &ps->absolute_magnitude,
                       &ps->distance_light_years,
                       ps->name);

                // skip blank or malformed lines
                if(fields != 9)
                {
                    free(ps);

                    continue;
                }

                if(starcatalog_append(sc, ps) == false)
                {
                    // the star was not added so we still own it
                    free(ps);

                    fclose(fp);

                    return false;
                }
            }
            else
            {
                fclose(fp);

                return false;
            }
        }

        fclose(fp);

        return true;
    }
    else
    {
        return false;
    }
}

The most complex piece of code in this project is starcatalog_import_csv. This takes a starcatalog pointer and a file path, and attempts to populate the catalog from the specified file.

Firstly we need a few variables: line is an input buffer used to hold individual lines from the file, fp is simply a FILE pointer, and ps is a star pointer which will be used to create each new star.

We attempt to open the file for reading and if this is successful call fgets once just to skip the first line of headings. After this we enter a while loop which repeatedly calls fgets to populate the line buffer with each line in the file. When it gets to the end of the file fgets returns NULL which is the exit condition for the loop.

On each iteration of the loop we use malloc to get memory for a new star, and then (if successful) use the sscanf function to copy the various parts of the raw data in the line variable to the members of the star struct.

As you can see I have split the sscanf function call onto a number of lines to make it easier to grok. The first argument is the line to read data from, and the second is the format string which is less intimidating if you take it a few characters at a time.

The first part

%15[^|]|

means read up to 15 characters until you hit a '|' character (ie the first column in pipe-separated data) and then read the '|' character itself which is effectively then ignored. The width is 15 rather than 16 because sscanf adds a terminating '\0' after the characters it reads, and the destination array only has room for 16 chars in total. The following three format specifiers are in the same format and we then reach four in the form

%lf|

which read in the four double values followed by their corresponding separator. We then have the final string which uses

%31[^\r\n]

rather than ending in [^|]| as it needs to read to the end of the line, which does not have a separator character. Unlike a plain %s it does not stop at the first space, so names consisting of more than one word are read in full.

The sscanf call then has nine lines specifying the addresses of the variables to place the various items of data in, in this case the members of the star struct. The strings are char arrays, which decay to char pointers, so can be used as they are but for the double members we need to use the & operator to obtain their addresses. sscanf returns the number of items it managed to assign, so anything other than 9 means the line was blank or malformed, in which case we free the star and skip the line.

I think the worst is over so now we just need to stick our new star onto the end of the list using starcatalog_append. This function allocates memory and returns false if that fails so we need to check this and return false if so, not forgetting to tidy up by freeing the star and closing the file first.
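
The format string can be tried out on a single line without opening a file. This is a minimal sketch using a cut-down, hypothetical struct (mini_star) and function (parse_line) which store just three of the nine columns. The %*[^|] and %*lf specifiers, which read a field and then discard it, are used here to skip the columns the sketch does not keep; they are not used in the project code.

```c
#include <stdbool.h>
#include <stdio.h>

// Hypothetical cut-down record, for illustration only.
typedef struct mini_star
{
    char star[16];
    double distance_light_years;
    char name[32];
} mini_star;

// Parse one pipe-separated line. Returns true only if all three
// stored fields were read.
bool parse_line(const char* line, mini_star* out)
{
    // %15[^|]    : up to 15 non-pipe characters (plus '\0' fills char[16])
    // %*[^|]     : read a text field and discard it
    // %*lf       : read a number and discard it
    // %31[^\r\n] : the rest of the line, spaces included
    // Discarded fields are not counted in the return value of sscanf.
    int fields = sscanf(line,
                        "%15[^|]|%*[^|]|%*[^|]|%*[^|]|%*lf|%*lf|%*lf|%lf|%31[^\r\n]",
                        out->star,
                        &out->distance_light_years,
                        out->name);

    return fields == 3;
}
```

Passing in the first data line of the file should fill in "alpha", 100 and "Alpheratz", while a blank or incomplete line makes the function return false.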

starcatalog.c part 4: starcatalog_print

//--------------------------------------------------------
// FUNCTION starcatalog_print
//--------------------------------------------------------
void starcatalog_print(starcatalog* sc)
{
    char line[172];
    memset(line, '-', sizeof(line));
    line[171] = '\0';

    puts(line);

    printf("|%-12s|%-23s|%-23s|%-17s|%-13s|%-20s|%-20s|%-11s|%-22s|\n",
     " Star", " Constellation", " Name", " Right Ascension", " Declination", " Apparent Magnitude", " Absolute Magnitude", " B-V", " Distance Light Years");

    puts(line);

    for(unsigned long i = 0; i < sc->size; i++)
    {
        printf("| %-10s ", sc->stars[i]->star);
        printf("| %-22s", sc->stars[i]->constellation);
        printf("| %-22s", sc->stars[i]->name);
        printf("| %-16s", sc->stars[i]->right_ascension);
        printf("| %-12s", sc->stars[i]->declination);
        printf("| %18f ", sc->stars[i]->apparent_magnitude);
        printf("| %18f ", sc->stars[i]->absolute_magnitude);
        printf("| %9f ", sc->stars[i]->B_V);
        printf("| %20f |\n", sc->stars[i]->distance_light_years);
    }

    puts(line);
}

In essence the print function is simple but there are a few points to note:

  • The data is printed with what you might generously describe as grid lines. The horizontal lines consist of 171 '-' characters so to avoid hard coding them all (and to show a nifty bit of code!) I have just declared a char array and populated it using memset, not forgetting to terminate the string with \0.

  • The line which prints the headings looks a bit complex but is basically a row of format specifiers in the form

    "|%-12s|"

    These print a '|' character, the string left-aligned and padded to the specified width, and another '|' character. After this come the strings to print. I could have hard-coded the heading as a single string but having the widths as numbers makes changes easier.

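  • The for loop through the data uses similar formatting in the form

    "| %-10s"

    for the strings, while the numeric columns use specifiers such as "| %18f " which right-align the value within the given width.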

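The effect of these width specifiers is easy to check in isolation. The sketch below uses snprintf to build single cells and a rule line in a buffer rather than printing them; the function names format_text_cell, format_number_cell and make_rule are hypothetical and exist only for this example. The %-*s and %*f forms take the width from the argument list instead of hard coding it in the format string.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper: left-align text in a cell.
// The '-' flag pads on the right; '*' takes the width from the arguments.
// Returns the number of characters in the finished cell.
int format_text_cell(char* out, size_t out_size, int width, const char* text)
{
    return snprintf(out, out_size, "| %-*s ", width, text);
}

// Hypothetical helper: right-align a number in a cell.
// Without the '-' flag the value is padded on the left.
int format_number_cell(char* out, size_t out_size, int width, double value)
{
    return snprintf(out, out_size, "| %*f ", width, value);
}

// Hypothetical helper: fill out with (out_size - 1) '-' characters and
// terminate it. out_size must be at least 1.
void make_rule(char* out, size_t out_size)
{
    memset(out, '-', out_size - 1);
    out[out_size - 1] = '\0';
}
```

With a width of 10, "alpha" becomes a 13 character cell padded on the right, and with a width of 9 the value 0.34 becomes a 12 character cell padded on the left.
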
starcatalog.c part 5: starcatalog_free

//--------------------------------------------------------
// FUNCTION starcatalog_free
//--------------------------------------------------------
void starcatalog_free(starcatalog* sc)
{
    for(unsigned long i = 0; i < sc->size; i++)
    {
        free(sc->stars[i]);
    }

    free(sc->stars);

    free(sc);
}

The last function is starcatalog_free which firstly frees the memory used by individual stars, then the stars array, and finally the catalog. We can now write the main function to try out all the above.

main.c

#include<stdio.h>
#include<stdlib.h>

#include"starcatalog.h"

//--------------------------------------------------------
// FUNCTION main
//--------------------------------------------------------
int main(int argc, char* argv[])
{
    puts("------------------------------------\n| code-in-c.com - Star Catalog 1.0 |\n------------------------------------\n");

    starcatalog* psc = starcatalog_new();

    if(psc != NULL)
    {
        if(starcatalog_import_csv(psc, "StarCatalogue.csv"))
        {
            starcatalog_print(psc);

            starcatalog_free(psc);

            return EXIT_SUCCESS;
        }
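        else
        {
            puts("Cannot import star catalog");

            // free the catalog and any stars read before the failure
            starcatalog_free(psc);

            return EXIT_FAILURE;
        }
    }
    else
    {
        puts("Cannot create star catalog");

        return EXIT_FAILURE;
    }
}

The main function calls starcatalog_new and then, if this is successful, calls starcatalog_import_csv. Again, if successful, it goes on to call the print and free functions. If the import fails the catalog is still freed before the program exits, as it may already contain some stars.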

We can now compile and run the program with these commands...

Compile and Run

gcc main.c starcatalog.c -g -std=c11 -lm -o main
./main

...which will give us the following output.

Program Output (Partial)

------------------------------------
| code-in-c.com - Star Catalog 1.0 |
------------------------------------

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Star       | Constellation         | Name                  | Right Ascension | Declination | Apparent Magnitude | Absolute Magnitude | B-V       | Distance Light Years |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| alpha      | Andromeda             | Alpheratz             | 00 08.2         | +29 04      |           2.100000 |          -0.400000 | -0.110000 |           100.000000 |
| beta       | Cassiopeia            | Caph                  | 00 09.0         | +59 08      |           2.300000 |           2.000000 |  0.340000 |            45.000000 |
| gamma      | Pegasus               | Algenib               | 00 13.1         | +15 10      |           2.800000 |          -3.100000 | -0.230000 |           490.000000 |
| beta       | Hydrus                | -                     | 00 25.4         | -77 17      |           2.800000 |           3.800000 |  0.620000 |            20.000000 |
| alpha      | Phoenix               | Ankaa                 | 00 26.2         | -42 20      |           2.390000 |           0.700000 |  1.090000 |            62.000000 |
| delta      | Andromeda             | -                     | 00 39.2         | +30 50      |           3.270000 |          -0.300000 |  1.280000 |           170.000000 |

I admit that was rather a lot of work just to get at a load of data we could have looked at in its original csv file (albeit not so nicely formatted) but as I mentioned above this is just the first phase, and there is plenty more we can do with this data: calculate derived values, filter and search, sort and export.

Please let me have your comments and suggestions, and follow Code in C on Twitter for news of future posts.
