Development

In the example below we build on the Reading a zip file from java using ZipInputStream page to provide basic filtering. This filtering is provided by the filteredExpandZipFile method taking a Predicate. Every ZipEntry is passed to the predicate, but only ones that match (predicate returns true) are included.

Note that the size of an entry cannot be accurately determined in all cases (ZipEntry.getSize() returns -1 when the size is not known), so it is not safe to perform validation on this field.

package com.thecoderscorner.example.compression;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Instant;
import java.time.temporal.ChronoField;
import java.util.Date;
import java.util.function.Predicate;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/**
 * ZipFilteredReader shows an example of filtering one or more matching
 * files from a ZipInputStream. Instead of expanding the whole archive
 * this uses the Predicate interface to only expand matching files.
 *
 * Files are output to the directory passed to the constructor.
 */
public class ZipFilteredReader
{
    private static final Logger LOGGER = Logger.getLogger("ZipReader");

    private final Path zipLocation;
    private final Path outputDirectory;

    /**
     * Here we create the ZipFilteredReader and configure it with a Predicate.
     * This predicate function is used to filter which files we want to copy
     * out of the zip file.
     */
    public static void main(String args[])
    {
        ZipFilteredReader reader = new ZipFilteredReader("c:/dev/temp/ziptest/output.zip", "c:/dev/temp/ziptest");

        // We define a simple predicate that only extracts files ending in
        // .txt from the zip archive and pass it to the zip filter method.
        reader.filteredExpandZipFile(zipEntry -> zipEntry.getName().endsWith(".txt"));

        // Another example predicate that filters files last written in or
        // after year 2015. Uncomment below to try this filter. An Instant has
        // no calendar fields, so it is converted to a zoned date-time first.
        //reader.filteredExpandZipFile(zipEntry -> {
        //    Instant lastModified = zipEntry.getLastModifiedTime().toInstant();
        //    int yearModified = lastModified.atZone(java.time.ZoneOffset.UTC).getYear();
        //    return yearModified >= 2015;
        //});
    }

    /**
     * Constructs the filtered zip reader passing in the zip file to
     * be expanded by filter and the output directory
     * @param zipLocation the zip file
     * @param outputDir the output directory
     */
    public ZipFilteredReader(String zipLocation, String outputDir) {
        this.zipLocation = Paths.get(zipLocation);
        this.outputDirectory = Paths.get(outputDir);
    }

    /**
     * This method iterates through all entries in the zip archive. Each
     * entry is checked against the predicate (filter) that is passed to
     * the method. If the filter returns true, the entry is expanded,
     * otherwise it is ignored.
     * @param filter the predicate used to compare each entry against
     */
    private void filteredExpandZipFile(Predicate<ZipEntry> filter) {
        // we open the zip file using a java 7 try with resources block
        try(ZipInputStream stream = new ZipInputStream(new FileInputStream(zipLocation.toFile())))
        {
            LOGGER.info("Zip file: " + zipLocation.toFile().getName() + " has been opened");

            // we now iterate through all files in the archive testing them
            // against the predicate filter that we passed in. Only items that
            // match the filter are expanded.
            ZipEntry entry;
            while((entry = stream.getNextEntry())!=null)
            {
                if(filter.test(entry)) {
                    LOGGER.info("Matched file " + entry.getName());
                    extractFileFromArchive(stream, entry.getName());
                }
                else {
                    LOGGER.info("Skipping file:  " + entry.getName());
                }
            }
        }
        catch(IOException ex) {
            LOGGER.log(Level.SEVERE, "Exception reading zip", ex);
        }
    }

    /**
     * We only get here when the stream is located on a zip entry.
     * Now we can read the file data from the stream for this current
     * ZipEntry. Just like a normal input stream we continue reading
     * until read() returns 0 or less (it returns -1 at the end of the entry).
     */
    private void extractFileFromArchive(ZipInputStream stream, String outputName) {
        // build the path to the output file and then create the file
        String outpath = outputDirectory + "/" + outputName;
        try (FileOutputStream output = new FileOutputStream(outpath)) {

            // create a buffer to copy through
            byte[] buffer = new byte[2048];

            // now copy out of the zip archive until all bytes are copied
            int len;
            while ((len = stream.read(buffer)) > 0)
            {
                output.write(buffer, 0, len);
            }
        }
        catch(IOException e) {
            LOGGER.log(Level.SEVERE, "Exception writing file", e);
        }
    }
}
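
One caveat with the listing above: entry names come from the archive itself, so a name such as ../outside.txt would be written outside the output directory. Below is a minimal sketch of a guard against that. The class SafeEntryPath and the helper resolveSafely are hypothetical names and are not part of the original class.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

class SafeEntryPath {
    /**
     * Hypothetical helper: resolves an entry name against the output
     * directory and rejects names that would escape from it.
     */
    static Path resolveSafely(Path outputDirectory, String entryName) throws IOException {
        Path base = outputDirectory.toAbsolutePath().normalize();
        // normalize() collapses any ".." segments before the check below
        Path target = base.resolve(entryName).normalize();
        if (!target.startsWith(base)) {
            throw new IOException("Entry is outside the output directory: " + entryName);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path out = Paths.get("ziptest"); // assumed directory name, for illustration only
        System.out.println(resolveSafely(out, "LICENSE.txt"));
        try {
            resolveSafely(out, "../outside.txt");
        } catch (IOException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

In extractFileFromArchive the returned path could then be used in place of the concatenated string, for example via toFile() when opening the FileOutputStream.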

Listing of the zip file that we used:

$ unzip -l output.zip
Archive:  output.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
      566  09-25-2014 15:21   LICENSE.txt
      661  09-25-2014 15:21   NOTICE.txt
       26  05-16-2015 17:18   output.log
---------                     -------
     1253                     3 files

We can see that only the .txt files were output to the directory:

$ ls
LICENSE.txt  NOTICE.txt  output.zip

Lastly, here's the log output:

May 16, 2015 5:21:47 PM com.thecoderscorner.example.compression.ZipFilteredReader filteredExpandZipFile
INFO: Zip file: output.zip has been opened
May 16, 2015 5:21:47 PM com.thecoderscorner.example.compression.ZipFilteredReader filteredExpandZipFile
INFO: Matched file LICENSE.txt
May 16, 2015 5:21:47 PM com.thecoderscorner.example.compression.ZipFilteredReader filteredExpandZipFile
INFO: Matched file NOTICE.txt
May 16, 2015 5:21:47 PM com.thecoderscorner.example.compression.ZipFilteredReader filteredExpandZipFile
INFO: Skipping file:  output.log

In this example I demonstrate how to use ConcurrentMap along with Future to generate a lazy cache. Concurrent maps are a good choice in situations where absolute atomicity and some control over the ordering of events can be traded off for performance.
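
As a rough sketch of the idea (not the article's own listing), the map can hold a Future per key so that each value is computed at most once, on first access. The class name LazyCache and the loader function are hypothetical names chosen for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class LazyCache<K, V> {
    private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // hypothetical: computes the value for a key

    LazyCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) throws InterruptedException, ExecutionException {
        Future<V> future = cache.get(key);
        if (future == null) {
            FutureTask<V> task = new FutureTask<>(() -> loader.apply(key));
            // putIfAbsent is atomic, so only one task per key is ever stored
            future = cache.putIfAbsent(key, task);
            if (future == null) {
                // this thread won the race, so it performs the computation
                future = task;
                task.run();
            }
        }
        // other threads block here until the winning thread has finished
        return future.get();
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger loads = new AtomicInteger();
        LazyCache<String, Integer> cache = new LazyCache<>(key -> {
            loads.incrementAndGet();
            return key.length();
        });
        System.out.println(cache.get("hello")); // computed on first access
        System.out.println(cache.get("hello")); // served from the cache
        System.out.println("loader calls: " + loads.get());
    }
}
```

Storing the Future rather than the value means a second caller for the same key waits for the first computation instead of repeating it.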

Following on from one of our popular articles on Reading a zip file from java using ZipInputStream we've put together a new article on how to create a zip archive using Java. The below example uses ZipOutputStream to create a zip file from all the items in a directory.

Zip files are written slightly differently to a normal stream in that each entry is put into the stream one at a time. So the procedure is as follows:

  1. Create a zip archive using ZipOutputStream
  2. Create a new ZipEntry to represent the file to be added
  3. Write the bytes for the file
  4. Repeat steps 2,3 for each file to be added
  5. Close the archive
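
The five steps above can be sketched as follows. This is a minimal sketch rather than the article's own listing: the class name ZipDirectoryWriter and the method zipDirectory are hypothetical, and only regular files directly inside the directory are added (there is no recursion into subdirectories).

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class ZipDirectoryWriter {
    /** Adds each regular file in sourceDir to a new archive and returns the entry count. */
    static int zipDirectory(Path sourceDir, Path zipFile) throws IOException {
        Path target = zipFile.toAbsolutePath().normalize();
        int count = 0;
        // step 1: create the archive; try-with-resources performs step 5 (close)
        try (OutputStream out = Files.newOutputStream(zipFile);
             ZipOutputStream zip = new ZipOutputStream(out);
             DirectoryStream<Path> files = Files.newDirectoryStream(sourceDir)) {
            for (Path file : files) {
                // skip directories, and the archive itself if it lives in sourceDir
                if (!Files.isRegularFile(file)
                        || file.toAbsolutePath().normalize().equals(target)) {
                    continue;
                }
                // step 2: create an entry representing the file
                zip.putNextEntry(new ZipEntry(file.getFileName().toString()));
                // step 3: write the bytes for the file
                Files.copy(file, zip);
                zip.closeEntry();
                count++; // step 4: the loop repeats for each file
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ZipDirectoryWriter <sourceDir> <zipFile>");
            return;
        }
        int added = zipDirectory(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Added " + added + " entries");
    }
}
```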

This is a question as much as a discussion. I can’t find a lot of detail on the state of play in this area and would really welcome any feedback or corrections. Please don’t read this as a negative article as that is not how it is intended; it’s in the optimisation section but I’m not sure what impact it really has. One thing is for sure: for 99% of systems you probably don’t have to worry about what’s going on here at all.

Recently I started to wonder how garbage collection events in large heaps affect cache performance. Earlier today I read an interesting article on the Mechanical Sympathy blog (Mechanical Sympathy: CPU flushing article) about the way processor caches work, and it reinforced a feeling I’ve had for a while: that old generation GCs on very large heaps could cause a lot of cache misses because each object in the generation has to be marked and swept. This led to more searches, where I dug up this page on Stack Overflow (Stack Exchange: is GC cache friendly).

Over the years there has been no shortage of ways to format a string in Java. What with the + operator, StringBuffer, StringBuilder, String.format(..) and various specialised formatters for numbers and dates, we sometimes feel a little spoilt for choice. But how do they all work and what are their advantages / disadvantages?
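
To make the comparison concrete, here is a small sketch that builds the same string three ways. The class and method names are illustrative only, and the sketch says nothing about relative performance.

```java
class StringFormattingExample {
    // The + operator: concise for short, fixed expressions.
    static String withPlus(String name, int count) {
        return "Hello " + name + ", you have " + count + " messages";
    }

    // StringBuilder: explicit appends, useful when building in a loop.
    static String withBuilder(String name, int count) {
        return new StringBuilder()
                .append("Hello ").append(name)
                .append(", you have ").append(count)
                .append(" messages")
                .toString();
    }

    // String.format: a template with %s and %d placeholders.
    static String withFormat(String name, int count) {
        return String.format("Hello %s, you have %d messages", name, count);
    }

    public static void main(String[] args) {
        System.out.println(withPlus("Ann", 3));
        System.out.println(withBuilder("Ann", 3));
        System.out.println(withFormat("Ann", 3));
    }
}
```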

There are several ways to format dates in Java, but by far the easiest is to use DateFormat. Creating a DateFormat is very similar to NumberFormat that we saw on the previous page. Here are the static factory methods called directly on the DateFormat class:
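
As a rough sketch of those factory methods in use (the class name DateFormatExample and the helper formatShortDate are illustrative only, and the printed text depends on the locale and time zone):

```java
import java.text.DateFormat;
import java.util.Date;
import java.util.Locale;

class DateFormatExample {
    // Illustrative helper: formats only the date part, in SHORT style.
    static String formatShortDate(Date date, Locale locale) {
        return DateFormat.getDateInstance(DateFormat.SHORT, locale).format(date);
    }

    public static void main(String[] args) {
        Date now = new Date();

        // default style and default locale
        System.out.println(DateFormat.getDateInstance().format(now));
        System.out.println(DateFormat.getTimeInstance().format(now));
        System.out.println(DateFormat.getDateTimeInstance().format(now));

        // explicit date style, time style and locale
        DateFormat longUk = DateFormat.getDateTimeInstance(
                DateFormat.LONG, DateFormat.SHORT, Locale.UK);
        System.out.println(longUk.format(now));

        System.out.println(formatShortDate(now, Locale.US));
    }
}
```

Note that DateFormat instances are not thread safe, so a shared instance needs external synchronisation.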

Following on from Setting up role based security in tomcat, we now switch from using a memory realm to one backed by a database. Memory realms are great for testing, but in any real application they would probably not be acceptable. Normally user credentials are stored in a database, so for this purpose there is a realm based on a datasource.

Depending on your view of things, you will either edit server.xml in the tomcat-home/conf directory, or you will edit the context file for the specific application. There are a few choices here, all of which are explained in the tomcat 7.0 documentation. For the purpose of this article we will edit server.xml.

CountDownLatch provides a means of waiting for a number of asynchronous events before proceeding. To do this, one constructs a latch providing the event count. Then one thread would normally call await whilst the other thread calls countDown. Once the count reaches zero the await call returns and the latch is set. If the call to await happens after the latch is set, it returns immediately.

In our example we need to wait for a thread to initialise before proceeding. We achieve this by creating a count down latch with a count of 1. Once the thread has done its work, it calls countDown on the latch. In the meantime the main thread has continued to do its longJob and then called await on the CountDownLatch instance. Calling await blocks until countDown has been called enough times (in this case once).
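
The sequence described above might look like the sketch below. The class name LatchExample, the method startAndWait and the body of longJob (a short sleep) are assumptions made for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

class LatchExample {
    // Hypothetical stand-in for the main thread's own work.
    static void longJob() throws InterruptedException {
        Thread.sleep(100);
    }

    static boolean startAndWait() throws InterruptedException {
        CountDownLatch initialised = new CountDownLatch(1);
        AtomicBoolean ready = new AtomicBoolean(false);

        Thread worker = new Thread(() -> {
            ready.set(true);          // the thread's initialisation work
            initialised.countDown();  // signal that initialisation is complete
        });
        worker.start();

        longJob();            // the main thread carries on in the meantime
        initialised.await();  // blocks until countDown has been called once
        return ready.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Worker initialised: " + startAndWait());
    }
}
```

If the worker finishes before longJob returns, the latch is already set and await returns immediately.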

In this entry I show how to use the inbuilt Java XMLStreamReader PULL parser class to read an XML file. The XML stream libraries are PULL based XML parsers that do not load the whole document into a memory structure, so they are better suited to large volumes of XML.

Below is an example XML file for a zoo; it contains Animal data types that have both attributes and data. It is kept simple for the sake of example. To run the example, copy this XML into the ROOT of your classpath.
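
A minimal sketch of pull parsing with XMLStreamReader follows. The XML here is an assumed stand-in held in a string, not the article's zoo file, and the element and attribute names (zoo, animal, name) as well as the class ZooPullParser are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class ZooPullParser {
    // Assumed sample document: each animal has a name attribute and text data.
    static final String SAMPLE =
        "<zoo>"
        + "<animal name=\"Leo\">lion</animal>"
        + "<animal name=\"Nellie\">elephant</animal>"
        + "</zoo>";

    static List<String> readAnimals(String xml) throws XMLStreamException {
        List<String> animals = new ArrayList<>();
        XMLStreamReader reader =
            XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            // we pull one event at a time instead of building a document tree
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "animal".equals(reader.getLocalName())) {
                    String name = reader.getAttributeValue(null, "name");
                    // reads the text content and moves to the end of the element
                    String species = reader.getElementText();
                    animals.add(name + " the " + species);
                }
            }
        } finally {
            reader.close();
        }
        return animals;
    }

    public static void main(String[] args) throws XMLStreamException {
        for (String animal : readAnimals(SAMPLE)) {
            System.out.println(animal);
        }
    }
}
```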

Following on from Setting up role based security in tomcat, we now look at accessing the realm security information from code. Although tomcat takes care of authenticating users at the right time, there are still times when we need to programmatically access the credential information. For example, the following snippet from userProfile.jsp is a mixed mode page in that anyone can view the page, but users with the manager role see more information.

To do this we use a method on the request object: request.isUserInRole(roleName). Below is an example of its usage from the userProfile page.

Subcategories

This area of the website discusses JVM optimisation and analysis techniques. This subject tends to trigger quite strong reactions in people, and it's not unusual for there to be many differing opinions on it.

My aim is to provide articles that are based on evidence and follow best practice.