
Your System is Finite

Let us talk limits for a bit. No, put away your graphs, this is about memory
limits. Before I get into the discussion, a few notes. I am going to be talking
mostly about particular errors and operations in Java. A healthy knowledge of
the Java Virtual Machine, with particular attention to garbage collection and
heap allocation, would be useful.

Recently I encountered some issues where our application was running out of
memory. In this case the Java virtual machine throws this error:
java.lang.OutOfMemoryError.

Why is this happening?

When your program runs in the JVM, the JVM has a specific amount of memory (a
subset of the system’s RAM) it can use to allocate objects. This area is the
heap, and it is where all the objects you create using the ‘new’ keyword will
live. Stuff like this:

CustomObject customObject = new CustomObject();

The amount of heap space the JVM has access to is based on the system it is
run on. The JVM will set the amount of heap space available to it when it
starts. As your program executes, the JVM will allocate space on the heap for
objects the program needs. Objects you no longer need will be removed via the
JVM’s garbage collector. The garbage collector, however, cannot remove objects
still in use. Therefore, if you create too many objects that stay in use, you
will use up all the heap space and get an OutOfMemoryError. This is the key
point: your system is finite.
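
To see how much heap a running JVM actually has to work with, you can ask the standard Runtime class. The sketch below is only an illustration; the class name HeapInfo and its helper methods are made up for this post.

```java
/** Small helper for inspecting the heap the running JVM was given. */
class HeapInfo {

    /** Upper limit the JVM will try to use for the heap, in bytes. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Heap currently occupied (live objects plus not-yet-collected garbage), in bytes. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("Max heap:  " + maxHeapBytes() / mb + " MB");
        System.out.println("Used heap: " + usedHeapBytes() / mb + " MB");
    }
}
```

The numbers will differ from machine to machine, which is the point: the limit is set when the JVM starts, and your program has to live inside it.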

For example, if your heap size is 256 MB and each object you create takes
1 MB, then code like this:

CustomObject[] customObjArray = getNumberOfCustomObjects(257);

will require at least 257 MB, which you do not have. Thus the JVM cannot
continue and it throws an OutOfMemoryError.
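
The arithmetic behind that example can be written down as a quick check. This is a rough sketch: HeapBudget and exceedsHeap are hypothetical names, and the check is only a lower bound, since real objects carry extra overhead and the heap is never completely empty to begin with.

```java
/** Back-of-the-envelope check: can this many objects possibly fit in the heap? */
class HeapBudget {

    static final long MB = 1024L * 1024L;

    /**
     * Returns true when the raw payload alone is larger than the heap.
     * A false result does NOT guarantee the objects fit; it only means
     * this simple estimate did not rule it out.
     */
    static boolean exceedsHeap(long objectCount, long bytesPerObject, long heapBytes) {
        // multiplyExact throws ArithmeticException on overflow instead of wrapping around
        long needed = Math.multiplyExact(objectCount, bytesPerObject);
        return needed > heapBytes;
    }

    public static void main(String[] args) {
        long heap = 256 * MB;
        System.out.println(exceedsHeap(257, MB, heap)); // prints true: 257 MB > 256 MB
        System.out.println(exceedsHeap(10, MB, heap));  // prints false
    }
}
```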

How Can We Fix This?

Now there is an easy fix to this. When you start your Java program, you run
it on the console like this:

java myProgram

You can specify arguments on java to increase the heap size. Specifically
something like this:

java -Xmx1g myProgram

This would set the JVM’s maximum heap size to 1 gigabyte. Do not do this! It is
tempting, it is easy, but it covers up an underlying problem in your program.
Now, of course, there is a time and place for this, but for the most part all
you are doing is delaying the problem, forcing future you to deal with it.
Future you will hate you for it. Let us look at a close-to-real-life
example.

Record[] recordArray = getAllRecordsFromMySql();

This is perfectly valid code. All we are doing here is getting a bunch of
objects back from the database. Simple code can get you into a lot of trouble
if you do not have a good understanding of how many records you can get back.
We have three possible cases here:

1. We get 0 items back.

2. We get n items that consume less than the total heap size.

3. We get n items that need more heap than we have.

In the first case, we are fine, no issues there. In the second case we are
also fine. You may not even have to worry about that if the memory needed for
the objects you expect back is always (key word) going to be less than your
heap size. For example, if our app can only ever store 10 records at a time
(like a rotating log), we will be fine. If, as I saw recently, the number of
records you could get back is nearly unlimited, then I can promise you that at
some point the third case is going to happen. You can then see that it does not
matter what we set our heap size to. If the number of records is bounded only
by the size of the system’s hard disk, then at some point the stored records
will exceed the total amount of available RAM. No heap size you could set would
be high enough, and your program explodes.

What you can do, however, is limit the number of records you process at one
time. Consider a couple of different strategies for this case (reading n
entries from a database).

1. Conditionally limiting your MySQL query to a certain id range.

If your tables are set up correctly, you could fetch your rows in batches
based on increasing id. For example, you could execute a query like
this:

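select * from records where id >= 0 and id < 500;

This would get all your records with ids from 0 to 499, or at most 500
records, assuming ids are unique (meaning two rows cannot have the same id).
Then you could advance your id range and get the next 500. Keep doing that
until the range has passed the largest id in the table and you have processed
all your results. Do not stop just because a batch comes back short: if rows
have been deleted, a range can contain fewer than 500 records while later
ranges still hold data.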
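
Here is a minimal sketch of that batching loop in Java. Everything named here (RangeBatcher, RecordSource, fetchRange, maxId) is hypothetical, and the in-memory source stands in for a real table so the example runs without a database. Records are plain strings for simplicity. Note that the loop stops once the range passes the largest id rather than on a short batch, since gaps in the ids can make a batch short before the end of the table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.function.Consumer;

/** Processes a table in fixed-size id ranges so only one batch is in memory at a time. */
class RangeBatcher {

    /** Hypothetical stand-in for the table; a real version would run the SQL query. */
    interface RecordSource {
        /** Rows whose id satisfies fromInclusive <= id < toExclusive. */
        List<String> fetchRange(long fromInclusive, long toExclusive);

        /** Largest id currently stored, or -1 if the table is empty. */
        long maxId();
    }

    /** Visits every record and returns how many were processed. */
    static long processAll(RecordSource source, long batchSize, Consumer<String> handler) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        long processed = 0;
        long maxId = source.maxId();
        for (long from = 0; from <= maxId; from += batchSize) {
            // Only this one batch is held in memory; the previous one can be collected.
            List<String> batch = source.fetchRange(from, from + batchSize);
            for (String record : batch) {
                handler.accept(record);
                processed++;
            }
        }
        return processed;
    }

    /** In-memory source, used only to make the sketch runnable. */
    static RecordSource inMemory(TreeMap<Long, String> rows) {
        return new RecordSource() {
            public List<String> fetchRange(long fromInclusive, long toExclusive) {
                return new ArrayList<>(rows.subMap(fromInclusive, true, toExclusive, false).values());
            }

            public long maxId() {
                return rows.isEmpty() ? -1 : rows.lastKey();
            }
        };
    }
}
```

With a batch size of 500 the program never holds more than 500 records at once, no matter how large the table grows.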

2. Using a cursor

A cursor is basically an iterator. It points to one row in your database.
From there you can move to another row. You only operate on one row at a time,
so memory use stays small no matter how many rows there are, as long as the
database driver streams rows instead of loading the whole result set up front.
This can be especially useful if you want to process a large number of objects
and then update them in the database. That code would look like this:

while (mysqlCursor.hasNext()) {
    Record record = mysqlCursor.next();
    processRecord(record);
    writeRecordToTable(record);
}
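
In plain JDBC the cursor is the ResultSet. Below is a rough sketch of the same loop using only standard java.sql calls. The names CursorScan, RowHandler and forEachRow are made up for illustration. Whether rows are truly streamed depends on the driver and its settings: as far as I know, MySQL Connector/J buffers the whole result by default and only streams when the fetch size is Integer.MIN_VALUE or cursor fetching is enabled on the connection, so check your driver's documentation before relying on this.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Walks a query result one row at a time instead of loading it into an array. */
class CursorScan {

    /** Hypothetical callback that handles the row the cursor currently points at. */
    interface RowHandler {
        void handle(ResultSet currentRow) throws SQLException;
    }

    /** Runs the query, hands each row to the handler, and returns the row count. */
    static long forEachRow(Connection conn, String sql, int fetchSize, RowHandler handler)
            throws SQLException {
        // Forward-only, read-only is the cheapest kind of cursor to ask for.
        try (Statement st = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            // A hint to the driver about how many rows to pull per round trip.
            st.setFetchSize(fetchSize);
            try (ResultSet rs = st.executeQuery(sql)) {
                long rows = 0;
                while (rs.next()) {
                    handler.handle(rs);
                    rows++;
                }
                return rows;
            }
        }
    }
}
```

A call might look like CursorScan.forEachRow(conn, "select * from records", 100, row -> processRecord(row)), where conn is an open connection and processRecord is your own method.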

Both strategies will limit the amount of data you have to manage at any one
time.

Key Takeaways

The key rule to remember here is that when working with computers, you are
always working with finite resources. Memory has limits. You can program around
them easily enough, but they are there and they must be respected.
