Resources For IT Professionals

# How to Query Trees Using LINQ

A (directed) tree is one of the most common data types. But how do you use trees in LINQ? This article introduces a generic LINQ-Extension which solves the problem, gives an example and discusses alternatives.

# Motivation

Starting point of this article was a simple question in the LINQ forum which has been unanswered for four years...

How can I select tree nodes independently from their nesting level?

So it might be worth to have a closer look at the task. Let's start with an example. A simple TreeNode class may look like this:

`public` `class` ``` TreeNode```
`{`
`    ``public` `string` ``` Id { ````get````; ````set``; }`
`    ``public` `int` ``` Data { ````get````; ````set``; }`
`    ``public` `List<TreeNode> Children { ``get````; ````private` ``` set````; }`

`    ``public` `TreeNode(``string` `id, ``int` `data)`
`        ````: ````this````(id, data, ````new` ``` TreeNode[0])```
`    ``{ }`

`    ``public` `TreeNode(``string` `node, ``int` `data, ``params` `TreeNode[] children)`
`    ``{`
`        ``Id = node;`
`        ``Data = data;`
`        ````Children = ````new` ``` List<TreeNode>(children);```
`    ``}`
`}`

The test data are:

`TreeNode rootNode =`
`    ``new` `TreeNode(``"node-0"``, 35,`
`        ``new` `TreeNode(``"node-1"``, 17,`
`            ``new` `TreeNode(``"node-2"``, 20)),`
`        ``new` `TreeNode(``"node-3"``, 19),`
`        ``new` `TreeNode(``"node-4"``, 5,`
`            ``new` `TreeNode(``"node-5"``, 25),`
`            ``new` `TreeNode(``"node-6"``, 40)));`

# Solution

Implement a generic extension class which can traverse over arbitrary trees. The only precondition is that the node's children are stored as an IEnumerable. The selector lambda expression avoids hard-coding of the access path and makes the extension reusable.

`public` `static` ``` class``` `LinqTreeExtension`
`{`
`    ``public` `static` ``` IEnumerable<T> SelectNestedChildren<T>```
`        ``(``this` `IEnumerable<T> source, Func<T, IEnumerable<T>> selector)`
`    ``{`
`        ``foreach` `(T item ``in` `source)`
`        ``{`
`            ````yield ````return` ``` item;```
`            ``foreach` `(T subItem ``in` `SelectNestedChildren(selector(item), selector))`
`            ``{`
`                ````yield ````return` ``` subItem;```
`            ``}`
`        ``}`
`    ``}`
`}`

The extension method SelectNestedChildren traverses the tree depth-first, left to right.

# Usage Example

Example 1:
Flattening the tree structure  (without the root node)

```var flattenedTreeWithoutRoot ```
`    ``= rootNode.Children`
`        ``.SelectNestedChildren(t => t.Children).ToList();`

The resulting list enumerates the nodes with IDs "note-1" to "note-6". Please note:  The root node is not contained in the resulting list. We'll fix this in the next example.

Example 2:
Flattening the tree structure  (including the root node)

```var flattenedTreeIncludingRoot ```
`    ````= ````new` ``` TreeNode[] {rootNode }```
`        ``.SelectNestedChildren(t => t.Children).ToList();`

The query is executed on an array, which contains only the root node. The resulting list enumerates the nodes with IDs "note-0" to "note-6" as expected.

Example 3: Performing a LINQ query on all nested children

`var result`
`    ````= ````new` ``` TreeNode[] { rootNode }```
`        ``.SelectNestedChildren(t => t.Children)`
`        ``.Where(t => t.Data > 30).ToList();`

The resulting list contains two nodes with the IDs "node-0" and "node-6"

The proposed LINQ extension has significant pros:
• The extension is applicable to all tree-structures which use an IEnumerable to hold the children.
• It fits seamlessly into other LINQ operations.
However, in some cases you may want to consider an alternative:

## Cyclic graph instead of a tree

Problem: The extension will loop forever if it runs into a cycle of the graph.
Solution: Modify the extension:
• Add a HashSet<T> to the extension which keeps track of the already visited notes.
• Your class T should override GetHashCode() and Equals()
• The extensions has to skip already visited nodes.

## Persisted tree in a database

Problem: If the tree is stored in a database (and not in memory), the extension produces unnecessary overhead.
Solution: All tree nodes reside in a database table which can be queried directly. The effect that they represent a logical tree or even a cyclic graph can be ignored. Using the Entity Framework the query looks like this:

`var databaseResult = context.TreeNodes`
`    ``.Where(t => t.Data > 30).ToList();`

However, at least two small changes should be applied to the sample class TreeNode:
• The property Children should be changed to a virtual property to enable lazy loading.
• The Entity Framework requires an additional parameterless constructor.

## XML data structures

An XML data structure is a special case of an directed tree.

Problem: LINQ to XML has specialized on XML Documents.
Solution: Use LINQ to XML. It reduces your coding effort. You stick to a standard. You may also use other benefits from LINQ to XML. Example:

`var xmlElements = XDocument.Load(``"myfile.xml"``).Descendants();`

`var result = xmlElements.Where(x => ...);`

XDocument.Descendents() returns an IEnumerable<XElement> which can be used in a query directly.

Conclusion:
Using trees with LINQ is simple. However, there is no "one size fits all" solution.

# Other languages

Wiki - Revision Comment List(Revision Comment)
Sort by: Published Date | Most Recent | Most Useful
• Carsten Siemens edited Revision 13. Comment: Added section: "Other languages". Added tags: Multi Language Wiki Articles, has Other Languages

• Carsten Siemens edited Revision 12. Comment: Fixed spelling - community voted 2:1 for "contradiction" instead of "contraindication"

• Carsten Siemens edited Revision 10. Comment: Fixed formatting in section "Links"

• Carsten Siemens edited Revision 9. Comment: Reset the title of a chapter to "Contraindications and alternatives" instead of "Contradictions and alternatives"

• Naomi  N edited Revision 8. Comment: Nice article

• Carsten Siemens edited Revision 5. Comment: Fixed size of a code box

• Carsten Siemens edited Revision 3. Comment: Added tag: "Carsten Siemens"

• Carsten Siemens edited Revision 2. Comment: Fixed tag: Best Practice

Page 1 of 2 (11 items) 12
Wikis - Comment List
Sort by: Published Date | Most Recent | Most Useful
Posting comments is temporarily disabled until 10:00am PST on Saturday, December 14th. Thank you for your patience.
• Carsten Siemens edited Original. Comment: Corrected case in title

• Carsten Siemens edited Revision 2. Comment: Fixed tag: Best Practice

• Carsten Siemens edited Revision 3. Comment: Added tag: "Carsten Siemens"

• Carsten Siemens edited Revision 4. Comment: Fixed misspelling

• Carsten Siemens edited Revision 5. Comment: Fixed size of a code box

• Naomi  N edited Revision 8. Comment: Nice article

• Carsten Siemens edited Revision 9. Comment: Reset the title of a chapter to "Contraindications and alternatives" instead of "Contradictions and alternatives"

• Carsten Siemens edited Revision 10. Comment: Fixed formatting in section "Links"

• Can you please explain the usage of the word contraindication www.thefreedictionary.com/contraindication in the context of this article?

• Hello Naomi,

according to Merriam-Webster - www.merriam-webster.com/.../contraindication - a contraindication is

"something (as a symptom or condition) that makes a particular treatment or procedure inadvisable".

Let’s map this definition to the article:

* The “treatment” is the class LinqTreeExtension which is explained in the “Solution” section.

* A condition which makes this “treatment” inadvisable is for example a cyclic graph.

The term contraindication is most often used by physicians and pharmacists. But (at least in Germany) it is not limited to medicine.

I like a figurative language, but I’m not a native speaker. You may have better synonyms.

Naomi, I really appreciate your careful editing. Thank you!

• I am also not a native speaker, so when I saw that word, I had to check it up. It may be nice if an English native speaker will verify if it is indeed permissible to be used in this context. That's why I originally changed it to more often and commonly used contradiction although may be just Pros and Cons may be used

• Naomi is correct, it's ConTRADiction. ConTRAINdication is a rarely used medical term.

Simple typo IMO.

"Contradiction" is very much the right word.

Regards,

Pete