I am learning C in 2020 – Beginner – How I learned about the standard output, input, error and pipe command

Book: Head First C

Chapter: 3

This implementation is provided in the book introduced above. I am simply going through it line by line to understand it better.

Operating System: Windows 10

Problem

Goal

Learn about standard output, standard input, standard error, the pipe command, and running programs concurrently by creating a program that does the following:

  • Reads a list of names
  • Filters out all names that do not start with a given character
  • Creates a sorted.txt file that holds, alphabetically sorted, all names that start with the given character, and an error.txt file that holds all names that do not.

Pseudo-code

  1. Read names separated by a new line character
  2. Filter out names based on their first character
  3. Output names in error.txt or sorted.txt accordingly

This solution requires that you have a basic understanding of pointers and arrays in C. If you need an overview, read here.

Solution

Read the names : input.c

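#include <stdio.h>

int main()
{
    char name[127];
    // %126s reads one whitespace-delimited word and leaves room for the '\0'
    while (scanf("%126s", name) == 1)
    {
        printf("%s\n", name);
    }
    return 0;
}

Why 127 char array ? Why not 20 or 10?

No specific reason. I just assumed a name would fit in 127 bytes, that is 126 characters plus the terminating null character.

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Curabitur et mauris metus. Sed eu purus et mi dignissim varius nullam. – length = 127 bytes

What arguments does scanf take ?

"%126s" → tells the program to skip leading whitespace, then read characters up to the next space, tab, or new line, storing at most 126 of them followed by a null character. Note that writing "%s[^\n]" does not combine %s with a scanset: scanf reads the %s word and then tries to match the literal characters [^\n], which fails harmlessly after the word has been stored. A scanset on its own is written %126[^\n] and reads a whole line, spaces included.

name → the array decays to a pointer to its first char, &name[0]. &name holds the same address but has the type char (*)[127], which is not what %s expects, so I pass name.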

Why loop till the return value is == 1 ?

Based on this cplus article, scanf returns the number of input items it successfully assigned. Since we are filling one argument, the name variable, a successful read returns 1.

When does the loop break?

When a reading error occurs or the end-of-file has been reached. In both cases scanf returns EOF instead of 1.

Why do we log the name after filling ?

The program passes each name, all chars in the name array up to the null character, to stdout. This will serve as the stdin for another program via the pipe ( | ) command.

Sort the data: sort.c

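#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Defined further down in this post (compare function breakdown)
void sort(const char *arr[], int n);

int main()
{
    char src[40][127];
    int count = 0;
    // Stop at 40 names so we never write past the end of src.
    // *(src + count) is the same as src[count]: the next free 127-char array.
    // %126s reads one word and leaves room for the terminating '\0'.
    while (count < 40 && scanf("%126s", *(src + count)) == 1)
    {
        count = count + 1;
    }
    if (count == 0)
    {
        return 0; // nothing to sort, and a zero-length array is not allowed
    }
    const char *names[count];
    for (int i = 0; i < count; i++)
    {
        names[i] = src[i];
    }
    sort(names, count);
    for (int i = 0; i < count; i++)
    {
        fprintf(stdout, "%s\n", names[i]);
    }
    return 0;
}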

Why src[40][127] ? why not char * src[127] ?

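char * src[127] → a list of 127 pointers, each able to point to a single char or to the first char of a string. The pointers alone reserve no space for the characters. Pointing them at string literals would not help either, since string literals are unmodifiable and scanf needs a writable array of characters to fill.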

src[40][127] → a list of 40 char arrays, each able to hold 126 chars + 1 null character. All we need to do is pass each char array to the scanf.

Why pass src + count ?

src → resolves to the memory location / pointer of the first char array → &(src[0])

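count → an incrementing integer

src + count → multidimensional pointer arithmetic → e.g. src + 1 points to the 2nd char array in the list of 40 char arrays called src, 127 bytes after the start. Dereferencing it, *(src + count), gives that char array, which is the same as src[count] and has the char * type that scanf expects for %s.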

Why create a new char array pointer variable “names” of length “count” ?

To leave out the empty arrays that remain in the src array. Since my data file holds 36 names, 4 arrays in the src array stay empty.

Why didn’t you just set the src char array length to 36 instead of 40 ?

I just wanted to show how one could make it a bit flexible.

What does “const char * “ mean and why use it ?

const char * → pointer to a char that may not be modified through this pointer

This does not copy the char arrays or turn them into string literals. The pointers in the names array still point to the mutable char arrays living on the stack in src. The const only promises that nobody will change the characters through these pointers. I use it because the following sort function takes an array of const char * as its argument.

Compare function breakdown

This compare function was borrowed from this article, which I will attempt to analyze.

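// Called by qsort with pointers to two elements of the array being sorted.
// Each element is a const char *, so a and b really are const char **.
static int myCompare(const void *a, const void *b)
{
    return strcmp(*(const char **)a, *(const char **)b);
}

// Sorts an array of n strings in place, in strcmp order.
void sort(const char *arr[], int n)
{
    qsort(arr, n, sizeof(const char *), myCompare);
}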

What are the arguments of the sort function?

const char * arr[] → an array of pointers, each pointing to the first char of a string that the function promises not to modify.

int n → the number of strings in arr.

What is qsort ?

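Sorts an array in place.

First argument → void* base → pointer to the first element of the array

Second argument → size_t num → the number of elements in the array

Third argument → size_t size → the size in bytes of each element in the array

Fourth argument → compar → pointer to a function that compares two elements and returns a negative, zero, or positive int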

What are the arguments of the myCompare function?

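const void *a → a pointer to one element of the array being sorted. Here each element is a const char *, so a really holds the address of a string pointer.

const void *b → a pointer to another element of the same array.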

What does strcmp do ?

int strcmp ( const char * str1, const char * str2 );

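Takes two pointers, each pointing to a char array it will not modify, and compares both char by char until a difference or the terminating null character is found. It returns a negative value if str1 sorts first, zero if both are equal, and a positive value if str2 sorts first.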

Why do the arguments look like that ?

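I too was confused at first. So let’s attempt to unravel it.

*(const char **)a

(const char **)a → casts the void pointer to a double pointer ( read here ) or second-level pointer, i.e. a pointer to a const char *

*(const char **)a → dereferencing → returns the const char * stored in that array element, i.e. the address of the first char of one name

So basically, we are still passing the address of a char array that will not be modified. humph…*eye rolls*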

For example

#include <stdio.h>

int main() {
    char notes[50] = "every good boy does fine"; // char array → stack
    const char *message = notes; // pointer to notes | read-only through this pointer
    const char **db = &message; // double pointer pointing to message
    printf("%p   = %p\n", (void *)*db, (void *)notes);
    printf("%c   = %c\n", **db, notes[0]);
    return 0;
}

Output (the address differs from run to run):

0x7fff3e3fa030   = 0x7fff3e3fa030
e   = e

Filter the data : filter.c

#include <ctype.h>
#include <stdio.h>

int main()
{
    char name[127];
    char filter = 'B';
    // scanf returns the number of values it was able to read;
    // %126s reads one word and leaves room for the terminating '\0'
    while (scanf("%126s", name) == 1)
    {
        // toupper expects an unsigned char value, hence the cast
        char uppercase = toupper((unsigned char)name[0]);
        if (uppercase != filter)
        {
            fprintf(stderr, "%s\n", name);
        }
        else
        {
            fprintf(stdout, "%s\n", name);
        }
    }
    return 0;
}

The filter program keeps all the names that start with a specific character, ignoring case. All names that do not pass the condition are passed to the standard error stream, while the winners go to the standard output.

What is the difference between fprintf and printf ?

int fprintf ( FILE * stream, const char * format, ... );
int printf ( const char * format, ... );

fprintf, as shown above, lets us specify whether our data should be passed to the standard output or the standard error, unlike printf, which always writes to the standard output. Note that fprintf(stdout, "dfg") is identical to printf("dfg").

Run

gcc input.c -o input && gcc sort.c -o sort && gcc filter.c -o filter && ( input | sort | filter ) < data.txt > sorted.txt 2> error.txt

The input, sort, and filter programs are compiled to executables.

( input | sort | filter ) < data.txt

Since the input program is the first program in the pipeline, it takes the original data directly from data.txt. The < symbol redirects a file into the standard input, stdin. The data that the input program outputs via printf or fprintf(stdout, …) is then passed to the sort program as its stdin. Similarly, filter takes in the stdout data from the sort program.

> sorted.txt

The command syntax for redirecting standard output, stdout, to a file is the “>” symbol. All strings outputted via fprintf(stdout, …) or printf in the last program, filter.exe, are saved in the sorted.txt file.

2> error.txt

The 2> symbol redirects the standard error stream, stderr, to a file. Thus all strings outputted by the fprintf(stderr, …) calls are written to the error.txt file.

Author Notes

This is most likely not the cleanest implementation, just a heads up.

Supplements

Data file

Jenna
Billy
Gifford
Inger
Gui
Ameline
Cobb
Cassey
Evyn
Wernher
Katharine
Tierney
Anallese
Kyle
Kaylee
Gerik
Kimbra
Marsha
Caz
Becki
Mireielle
Jeane
Benetta
Huntley
Kaylee
Marylynne
Farlie
Hale
Edwina
Nevsa
Amabel
Joachim
Lexi
Wendeline
Donnie

sorted.txt

Becki
Benetta
Billy

error.txt

Amabel
Ameline
Anallese
Cassey
Caz
Cobb
Donnie
Edwina
Evyn
Farlie
Fin
Gerik
Gifford
Gui
Hale
Huntley
Inger
Jeane
Jenna
Joachim
Katharine
Kaylee
Kaylee
Kimbra
Kyle
Lexi
Marsha
Marylynne
Mireielle
Nevsa
Tierney
Wendeline
Wernher

Links

  1. Sort array of strings
  2. C library function – scanf()
  3. Multidimensional Pointer Arithmetic in C/C++
  4. Difference between char* and const char*?
  5. C++ Converting a char to a const char *
  6. Strcmp
  7. Qsort
  8. Fprintf
  9. printf