BufferedReader of The EOF Kind

Introduction

I have been adding automated tests to projects that I have written at work. One of these projects outputs CEF formatted messages to syslog. The test needed to read /var/log/messages and compare the CEF message with a known good message. I ran into a problem while reading the file.

What Problem?

Even a novice Java developer knows that reading a text file is no big deal: chain the right streams together and the contents are easily extracted. I wanted to read the file line by line, so I wrapped a java.io.FileReader in a java.io.BufferedReader. Everything should work, right? Indeed, each line was dutifully read and displayed in my log except the last line, the exact line that I was trying to compare. I found what looked like the root cause in the javadocs themselves. Here is the excerpt from BufferedReader’s javadoc for readLine():

Reads a line of text. A line is considered to be terminated by any one of a line feed (‘\n’), a carriage return (‘\r’), or a carriage return followed immediately by a linefeed.

That sounds reasonable: at the end of a line, a line is returned. But what if the last line ends with an end of file (EOF) instead of a line terminator? In my test, readLine() returned null before I ever saw the line I was after. I read the excerpt above as saying that java.io.BufferedReader ends a line at a line feed, a carriage return, or a carriage return followed by a line feed, but not at an EOF. Syslog ends its logs with an EOF, not a line terminator, at the end of the last line. Be aware that later versions of the javadoc add reaching the end of file to that list of terminators, so it is worth checking how readLine() behaves on your JDK before assuming the same. Either way, this was a major problem for my test, because in most cases I have to check the last line to validate the output. At the time I put it down to an oversight by the makers of the Java libraries.
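A quick way to see how the stock java.io.BufferedReader treats a final line with no terminator is a small probe like the sketch below. ReadLineProbe and linesOf are names made up for this illustration, and the sample text is invented; whether the last entry shows up in the printed list answers the question for the JDK it runs on.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical probe class; not part of the project discussed in this post.
class ReadLineProbe {
    /** Collects every non-null result of readLine() from the stock BufferedReader. */
    static List<String> linesOf(String text) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // The sample text deliberately ends without '\n' or '\r'.
        System.out.println(linesOf("first line\nlast line without terminator"));
    }
}
```

Swapping the StringReader for a FileReader on a copy of the log file runs the same check against real data.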

What To Do, What To Do

I obviously needed to change java.io.BufferedReader’s definition of the end of a line. My first thought was to extend BufferedReader, so I started down the checklist. The class is not final, but the buffer underneath is not visible to subclasses. This is a problem because the only way to get at the buffered data is through the methods inherited from java.io.Reader. So my initial thought was to override the readLine() method and use the read() method to pull the characters out one by one and parse each line. Making repeated method calls just to access a buffer seemed like a waste. It would be better to read the buffer directly and increment a counter, which meant that I would basically have to recreate BufferedReader. “This should be no problem. I am an experienced developer. This will be easy!” I thought to myself.
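For comparison, the first idea, overriding readLine() and pulling characters through read(), might look roughly like the sketch below. CharByCharLineReader and readAll are hypothetical names, and this is not the code from the project; it only illustrates the approach that was set aside. It relies on BufferedReader supporting mark() and reset() to peek at the character after a carriage return.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical subclass: treats LF, CR, CR LF and end of file as line endings.
class CharByCharLineReader extends BufferedReader {
    CharByCharLineReader(Reader in) {
        super(in);
    }

    @Override
    public String readLine() throws IOException {
        int c = read();
        if (c == -1) {
            return null; // nothing left to read at all
        }
        StringBuilder line = new StringBuilder();
        while (c != -1 && c != '\n' && c != '\r') {
            line.append((char) c);
            c = read();
        }
        if (c == '\r') {
            // Peek one character ahead to swallow the LF of a CR LF pair.
            mark(1);
            int next = read();
            if (next != -1 && next != '\n') {
                reset();
            }
        }
        return line.toString();
    }

    /** Convenience helper for trying the reader on an in-memory string. */
    static List<String> readAll(String text) {
        List<String> lines = new ArrayList<>();
        try (CharByCharLineReader reader = new CharByCharLineReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }
}
```

Every character costs a method call here, which is the overhead the paragraph above objects to, but the whole class fits on one screen.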

A Lesson Learned

Four days and 26 automated tests later, the beast was done. While it is more efficient to use a counter on an array than to make repeated method calls, it did not make good time management sense. Remember, this was for an automated test, not code destined to be delivered to a customer. A second or two extra per test run does not matter in the long run. Getting it done in a quarter of the time so the tests can be run does matter. In my mind, the sooner a test is written, the more time is saved, and time is money.

The Solution

In theory, the concept is simple: the buffer is filled from the inner reader. Once the buffered characters have been used up, the buffer is cleared and the next set of data is read from the inner reader. This repeats until the inner reader returns -1. The main methods are skip(long n), read(char[] cbuf, int off, int length), readLine() and fillBuffer().
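The fill-and-drain cycle described above can be sketched in miniature. ChunkedLineReader below is a hypothetical, stripped-down stand-in, not the class from the project: it has no skip() or read(char[], int, int), its field names are only borrowed from the excerpts that follow, and it assumes a buffer size greater than zero.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of a line reader that keeps its own buffer.
class ChunkedLineReader {
    private final Reader in;
    private final char[] buffer;
    private int offset = 0;   // next unread position in buffer
    private int endIndex = 0; // one past the last valid character

    ChunkedLineReader(Reader in, int bufferSize) {
        this.in = in;
        this.buffer = new char[bufferSize];
    }

    /** Refills the drained buffer; returns false once the inner reader reports -1. */
    private boolean fill() throws IOException {
        int n = in.read(buffer, 0, buffer.length);
        offset = 0;
        endIndex = Math.max(n, 0);
        return n != -1;
    }

    /** Returns the next character without consuming it, or -1 at end of input. */
    private int peek() throws IOException {
        while (offset == endIndex) {
            if (!fill()) {
                return -1;
            }
        }
        return buffer[offset];
    }

    String readLine() throws IOException {
        int c = peek();
        if (c == -1) {
            return null; // no characters left at all
        }
        StringBuilder line = new StringBuilder();
        while (c != -1 && c != '\n' && c != '\r') {
            line.append((char) c);
            offset++;
            c = peek();
        }
        if (c != -1) {
            offset++; // consume the terminator itself
            if (c == '\r' && peek() == '\n') {
                offset++; // second half of a CR LF pair, possibly from the next chunk
            }
        }
        return line.toString(); // end of file also ends the line
    }

    static List<String> readAll(String text, int bufferSize) {
        ChunkedLineReader reader = new ChunkedLineReader(new StringReader(text), bufferSize);
        List<String> lines = new ArrayList<>();
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }
}
```

Calling readAll with a tiny buffer size, such as 4, forces line terminators to straddle refills, which is where this kind of reader is easiest to get wrong.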

skip(long n)

    @Override
    public long skip(long n) throws IOException {
        long numSkipped = 0;
        long leftToSkip = n;
        int lenRead = 0;
        while(leftToSkip > 0 && lenRead != -1) {
            // Refill only once everything already buffered has been skipped or read.
            if(offset == endIndex) {
                lenRead = fillBuffer();
            }
            if(lenRead != -1) {
                // Skip what is buffered, up to the amount still requested.
                int amountBuffered = endIndex - offset;
                long amountToSkip = Math.min(amountBuffered, leftToSkip);
                offset += amountToSkip;
                numSkipped += amountToSkip;
                leftToSkip -= amountToSkip;
            }
        }

        // May be less than n if the inner reader ran out first.
        return numSkipped;
    }

read(char[] cbuf, int off, int length)

    @Override
    public int read(char[] cbuf, int off, int length) throws IOException {
        int totalRead = 0;
        int targetOffset = off;
        int leftToCopy = length;
        int readLen = 0;

        while(leftToCopy > 0 && readLen != -1) {
            // Refill only once everything already buffered has been handed out.
            if(offset == endIndex) {
                readLen = fillBuffer();
            }
            if(readLen != -1) {
                // Copy what is buffered, up to the amount still requested.
                int amountBuffered = endIndex - offset;
                int lenToCopy = Math.min(amountBuffered, leftToCopy);
                System.arraycopy(buffer, offset, cbuf, targetOffset, lenToCopy);
                offset += lenToCopy;
                targetOffset += lenToCopy;
                leftToCopy -= lenToCopy;
                totalRead += lenToCopy;
            }
        }

        // Report end of stream only when characters were requested and none could be copied.
        if(totalRead == 0 && length > 0) {
            return -1;
        }

        return totalRead;
    }
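From the caller's side, the contract this method has to honor is the usual java.io.Reader one: a call may deliver fewer characters than requested, and -1 signals the end of the stream. A minimal sketch of a caller that relies on exactly that follows; ReadLoopExample and drain are hypothetical names, and the sketch uses a plain StringReader rather than the custom reader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical caller; any Reader implementation can be passed in.
class ReadLoopExample {
    /** Drains a reader through a small array, honoring short reads and the -1 end marker. */
    static String drain(Reader in, int chunkSize) {
        char[] chunk = new char[chunkSize];
        StringBuilder out = new StringBuilder();
        try {
            int n;
            while ((n = in.read(chunk, 0, chunk.length)) != -1) {
                // Only the first n characters of chunk are valid after each call.
                out.append(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(drain(new StringReader("abcdefg"), 3));
    }
}
```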

readLine()

    public String readLine() throws IOException {
        StringBuilder line = new StringBuilder();
        boolean foundTerminator = false;
        int readLen = 0;
        final char LINEFEED = '\n';
        final char CR = '\r';

        if(offset == endIndex) {
            readLen = fillBuffer();
        }

        while(!foundTerminator && readLen != -1) {
            char current = buffer[offset++];
            if(current == CR) {
                foundTerminator = true;
                // The line feed of a CR LF pair may only arrive with the next fill.
                if(offset == endIndex) {
                    readLen = fillBuffer();
                }
                if(readLen != -1 && buffer[offset] == LINEFEED) {
                    offset++;
                }
            } else if(current == LINEFEED) {
                foundTerminator = true;
            } else {
                line.append(current);
            }

            // Running out of input (-1) ends the loop, so EOF also terminates a line.
            if(offset == endIndex) {
                readLen = fillBuffer();
            }
        }

        // Return null only when nothing was read; a blank line comes back as "".
        if(line.length() == 0 && !foundTerminator) {
            return null;
        }

        return line.toString();
    }

fillBuffer()

    private int fillBuffer() throws IOException {
        // Slide any unread characters to the front so the rest of the buffer can be refilled.
        moveLeftoverToBeginning();
        endIndex = endIndex - offset;
        offset = 0;

        // Read as many characters as fit behind the leftover data.
        int length = bufferSize - endIndex;
        int lenRead = in.read(buffer, endIndex, length);
        if (lenRead != -1) {
            endIndex += lenRead;
        }

        // -1 means the inner reader is exhausted; anything still buffered stays readable.
        return lenRead;
    }
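The helper moveLeftoverToBeginning() is not shown in this post. One plausible shape for it, assuming it only has to slide the unread characters to the front of the array, is the sketch below. BufferCompaction and compact are hypothetical names, and the real helper in the project may well differ.

```java
// Hypothetical stand-in for moveLeftoverToBeginning(); an assumption, not the project's code.
class BufferCompaction {
    /**
     * Copies the unread range [offset, endIndex) to the front of the buffer
     * and returns the new endIndex. The caller resets its offset to 0 afterwards.
     */
    static int compact(char[] buffer, int offset, int endIndex) {
        int leftover = endIndex - offset;
        // System.arraycopy copes with overlapping ranges within the same array.
        System.arraycopy(buffer, offset, buffer, 0, leftover);
        return leftover;
    }

    public static void main(String[] args) {
        char[] buffer = "abcdefgh".toCharArray();
        int endIndex = compact(buffer, 5, 8); // "fgh" is still unread
        System.out.println(new String(buffer, 0, endIndex)); // prints "fgh"
    }
}
```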

Conclusion

In this blog entry, a custom BufferedReader is discussed. The reader treats EOF as a line terminator, which makes it easier to verify the output of a CEF formatted syslog message. To see the rest of this BufferedReader and its tests, get the Maven project via git at https://github.com/darylmathison/buffered-reader-example.

The Blog

Daryl Mathison’s journey in the world of technology as a senior software developer
