Iteratively Processing Recursive Data, Such As Directories - TechNet Articles - United States (English) - TechNet Wiki


This article was originally published as the two articles Retrieving Hierarchically Recursive Data Iteratively and Processing Folders Iteratively that are also in my website

Iteratively Processing Recursive Data, Such As Directories


This article was originally published as the two articles Sam's Simple Samples.


Introduction

Many objects are recursive hierarchies (you can say, like the root system of a tree where roots have roots). The directories of a file system are recursive hierarchies since a directory can contain one or more sub-directories that can also contain one or more sub-directories. Windows of a GUI are recursive hierarchies since a window can contain client windows or controls (which are windows) that can contain controls or other windows. Engineering and manufacturing parts lists (such as for refrigerators and spacecraft) are recursive hierarchies. Organizational structures (employees) are recursive hierarchies.

Recursively hierarchical data can be retrieved and processed recursively. For each parent item, such as a directory or a window, the function processing the structure can call itself for each child. It is very easy to design a way to retrieve hierarchical data recursively.

This article does not explain the advantages and disadvantages of recursion and iteration. The purpose of this article is to explain how to iteratively process recursively hierarchical data with an example.

The "To Do" Stack

When processing data recursively, recursion uses a stack internally. Each time a function is called (recursively or not), the data for the calling function is pushed onto the stack. For a recursive call, part or all of that data is usually the parent item, such as the directory or window, whose children are to be processed.

The trick to processing data iteratively (not recursively) is to simply use a ("to do") stack for the data that has not yet been processed but needs to be processed. During the processing of a parent item, the children (that are then to be done) are stacked (using a Last-In-First-Out (LIFO) collection). The .Net Stack class can be used for that. Objects are pushed onto the stack and popped off the stack until all data is processed. The data is subsequently processed in reverse, in the sense that if "1", 2"", "3" and "4" pushed onto the stack in that order, they are processed as "4", "3", "2" and "1".

Example

Let us use the following data as an example.

  • Top
    • First
      • First-1
        • John
        • Mike
        • Steve
      • First-2
        • Susan
        • Julie
        • Kara
      • First-3
        • Kim
        • Kelly
        • Pat
    • Second
      • Second-1
      • Second-2
      • Second-3

When processing the data iteratively, the stack would be as follows, one line per iteration:

Top
First Second 
First-1 First-2 First-3 Second 
John Mike Steve First-2 First-3 Second 
Mike Steve First-2 First-3 Second 
Steve First-2 First-3 Second 
First-2 First-3 Second 
Susan Julie Kara First-3 Second 
Julie Kara First-3 Second 
Kara First-3 Second 
First-3 Second 
Kim Kelly Pat Second 
Kelly Pat Second 
Pat Second 
Second 
Second-1 Second-2 Second-3 
Second-2 Second-3 
Second-3 

In other words, we begin by pushing "Top" onto the stack. We then iterate (loop) by popping that off the stack and processing it. Processing includes pushing the children onto the stack. So the iteration continues until there are no more children to be processed; in other words, there is nothing more to process. Therefore as each item is processed, it is effectively replaced by its children, if any.

Implementation

The preceding example could be implemented with the following code that processes items of a class called "Hierarchy" that consists of:

class Hierarchy
{
    public string Name;
    public List Children = new List();
    public Hierarchy Parent = null;
     
    public Hierarchy(string Name, Hierarchy Parent)
    {
        this.Name = Name;
        this.Parent = Parent;
    }
}

The stack of items to be processed could then be:

Stack<Hierarchy> ToDo = new Stack<Hisp;public Hierarchy(erarchy>();

The data could be processed iteratively using:

ToDo.Push(Top);
while (ToDo.Count > 0)
{
    Hierarchy h = ToDo.Pop();
    Put(h, ToDo);
}

Where the following is the "Put" method:

private void Put(Hierarchy h, Stack ToDo)
{
    string tabs = new string('\t', Level);
    Console.WriteLine("{0}{1}", tabs, h.Name);
    // store the children in the to do stack
    if (h.Children == null)
        return;
    // since the stack is LIFO, the objects will be popped off in
    // reverse order from what they are pushed, so we will push
    // in reverse order
    for (int i = h.Children.Count - 1; i >= 0; --i)
        ToDo.Push(h.Children[i]);
}

See the sample code to see how "Level" is determined.

The following is an equivalent method for processing the data recursively:

private static void PutRecursively(Hierarchy h)
{
    string tabs = new string('\t', Level);
    Console.WriteLine("{0}{1}", tabs, h.Name);
    // store the children in the to do stack
    if (h.Children == null)
        return;
    foreach (Hierarchy child in h.Children)
        PutRecursively(child);
}

Sample Program: Retrieving Internet Explorer Favorites (Folders)

The sample program for this article retrieves the directories of the user's Internet Explorer favorites but could easily be adapted for other directories. Internet Explorer Favorites are often organized into many folders. This example does not retrieve the actual favorites; they are individual files within the directories. Reading the files (favorites) would be an easy addition.

To retrieve the directories, a class called "Folder" is used with the following members:

Name Type Description
Name string Unqualified folder name
Parent Folder Recursive reference to a parent folder
Attributes FileAttributes  
CreationTime DateTime  
LastAccessTime DateTime  
LastWriteTime DateTime  
Children List<Folder> Subdirectories
Level int Hierarchical depth, used only for formatting output
GetFolder Stack<Folder> ToDo, string root Gets the subdirectories of the directory
Qualify string Creates an absolute path
Unqualify string Gets just the unqualified name of a directory
PutFolder(Stack<Folder> ToDo, StreamWriter sw) void Writes the folder to the text file and queues the children

Note that the Name is just the unqualified name and does not include the qualification.

The Internet Explorer Favorites are usually stored in the "C:\Users\{User}\Favorites" directory, so Name in the Folder object for the top folder to be processed is just "Favorites". The Internet Explorer Favorites are usually stored in the "C:\Users\{User}\Favorites" directory. We use the following to get the "C:\Users\{User}" portion:

Environment.GetFolderPath(Environment.SpecialFolder.UserProfile);

To begin the iteration, we first create a "Folder" object for the top folder, "Favorites ", and push it onto the stack. After pushing "Favorites" onto the stack, we begin the iteration that pops it off the stack and processes it. We process each object (folder) by retrieving its children and pushing each of them onto the stack of folders to do.

The sample program retrieves the folders into "Folder" objects then uses (shows) a separate loop to write them to a text file. Note that you will need to change the name of the file in the sample program.

Main Method

The following is the Main method of the console program:

string RootFolderName = Environment.GetFolderPath(Environment.SpecialFolder.UserProfile);
 
Folder TopFolder = new Folder();
TopFolder.Name = "Favorites";
ToDo.Push(TopFolder);
try {sw = new StreamWriter(OutFilename);}
catch (Exception ex)
{
string RootFolderName = Environment.GetFolderPath(Environment.SpecialFolder.UserProfile);
 
Folder TopFolder = new Folder();
TopFolder.Name = "Favorites";
    Console.WriteLine("Output file error: {0}", ex.Message);
    return;
}
// Get the folders
Console.WriteLine("\tGetting");
while (ToDo.Count > 0)
{
    (ToDo.Pop()).GetFolder(ToDo, RootFolderName);
    // The following will write something to the console
    // just so we know the program is working.
    Console.WriteLine("\tToDo.Count={0}", ToDo.Count);
}
// Put the folders. Start by creating the ToDo stack again.
Console.WriteLine("\tPutting");
ToDo.Push(TopFolder);
while (ToDo.Count > 0)
{
    (ToDo.Pop()).PutFolder(ToDo, sw);
    // The following will write something to the console
    // just so we know the program is working.
    Console.WriteLine("ToDo.Count={0}", ToDo.Count);
}
sw.Flush();

Where the "ToDo" stack is:

static Stack<Folder> ToDo = new Stack<Folder>();

And "sw" is a StreamWriter as in:

static StreamWriter sw = null;

Folder Class

The following is the "Folder" class:

class Folder
{
    public string Name;
    public Folder Parent = null;
    public FileAttributes Attributes { get; set; }
    public DateTime CreationTime { get; set; }
    public DateTime LastAccessTime { get; set; }
    public DateTime LastWriteTime { get; set; }
    public List<Folder> Children = new List<Folder>();
    int Level = 0;        // only for formatting output
 
    /// <summary>
    /// Gets the information for a folder including the subdirectories.
    /// A new object is created for each subdirectory and each of them
    int Level = 0;  &und-color:#ffffff;">    /// is added to the Children for this directory and to the ToDo stack.
    /// </summary>
    /// <param name="ToDo"></param>
    internal void GetFolder(Stack<Folder> ToDo, string root)
    {
        string Path = root + '\\' + Qualify();
        DirectoryInfo di = new DirectoryInfo(Path);
        Attributes = di.Attributes;
        CreationTime = di.CreationTime;
        LastAccessTime = di.LastAccessTime;
        LastWriteTime = di.LastWriteTime;
        List<string> dirs = new List<string>(Directory.EnumerateDirectories(Path));
        foreach (string fn in dirs)
        {
            Folder sf = new Folder();
            sf.Name = Folder.Unqualify(fn);
            sf.Parent = this;
            sf.Level = Level + 1;
            Children.Add(sf);
            ToDo.Push(sf);
        }
    }
 
    /// <summary>
    /// This will prefix the parent directory names separated
    /// by backslashes to create a fully qualified path. The
    /// result must be qualified by the root path.
    /// </summary>
    /// <param name="Folder"></param>
    /// <returns></returns>
    public string Qualify()
    {
        Folder TempFolder = this;
        Stack<string> names = new Stack<string>();
        names.Push(Name);
        while (TempFolder.Parent != null)
        {
            TempFolder = TempFolder.Parent;
            names.Push(TempFolder.Name);
        }
        return string.Join("\\", names.ToArray());
    }
 
;     TempFolder = TempFolder.Parent;
            names.Push(TempFolder.Name);
        <    /// <summary>
    /// Since Directory.EnumerateDirectories returns the fully qualified path
    /// of each directory, we need to get just the unqualified name of a directory.
    /// </summary>
    /// <param name="dir">fully qualified path</param>
    /// <returns>unqualified path</returns>
    public static string Unqualify(string dir)
    {
        int i = dir.LastIndexOf('\\');
        return dir.Substring(i + 1);
    }
 
    internal void PutFolder(Stack<Folder> ToDo, StreamWriter sw)
    {
        string tabs = new string('\t', Level);
        sw.WriteLine("{0}{1} {2} {3}", tabs, Name, CreationTime, LastAccessTime);
        // since the stack is LIFO, the objects will be popped off in
        // reverse order from what they are pushed, so we will push
        // in reverse order
        for (int i = Children.Count - 1; i >= 0; --i)
            ToDo.Push(Children[i]);
    }
}

See Also

Source Code

Download a complete project with source code from this article from the TechNet Gallery item Iteratively Processing Recursive Data, Such As Directories.