3 ways to convert a path into cumulative segments in JavaScript

Published on in JavaScript and Regular expressions

How to convert '/foo/bar/baz/' into an array of '/', '/foo/', '/foo/bar/', and '/foo/bar/baz/'. Example use case: building breadcrumb navigations.

The problem

Given a path like /foo/bar/baz/, we want to convert it into an array like this:

['/', '/foo/', '/foo/bar/', '/foo/bar/baz/']

Starting from the left, each array item extends up to the next forward slash in the path. The path could be a directory path or a URL pathname.

I needed this when creating breadcrumb navigations for this website, so the paths were URL pathnames.

The 3 solutions

Split and reduce

This was my initial solution:

const getCumulativePathSegments = (path) =>
  path
    .split('/')
    .filter(Boolean) // Drop empty strings caused by the splitting
    .reduce(
      (segments, segment) => {
        const previous = segments[segments.length - 1]
        segments.push(`${previous}${segment}/`)
        return segments
      },
      ['/']
    )

const pathWithTrailingSlash = '/foo/bar/baz/'
const pathWithoutTrailingSlash = '/foo/bar/baz'

getCumulativePathSegments(pathWithTrailingSlash)
//=> ['/', '/foo/', '/foo/bar/', '/foo/bar/baz/']

getCumulativePathSegments(pathWithoutTrailingSlash)
//=> ['/', '/foo/', '/foo/bar/', '/foo/bar/baz/']

Simplistic, but it works.

One caveat though: the last array item will have a trailing slash even if the input string doesn't have a trailing slash. This may or may not be a problem in your case.
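If the trailing slash does matter, the reducer can be adjusted to respect it. The following is only a sketch: the function name getCumulativePathSegmentsKeepingEnd and the endsWith() check are my own assumptions for illustration, not part of the original solution.

```javascript
// Hypothetical variant of the split and reduce solution (illustrative name)
const getCumulativePathSegmentsKeepingEnd = (path) => {
  const names = path.split('/').filter(Boolean)
  const endsWithSlash = path.endsWith('/')

  return names.reduce(
    (segments, name, index) => {
      const previous = segments[segments.length - 1]
      const isLast = index === names.length - 1
      // Only the last item may lose its slash, and only if the input had none
      const suffix = isLast && !endsWithSlash ? '' : '/'
      segments.push(`${previous}${name}${suffix}`)
      return segments
    },
    ['/']
  )
}

console.log(getCumulativePathSegmentsKeepingEnd('/foo/bar/baz/'))
//=> ['/', '/foo/', '/foo/bar/', '/foo/bar/baz/']

console.log(getCumulativePathSegmentsKeepingEnd('/foo/bar/baz'))
//=> ['/', '/foo/', '/foo/bar/', '/foo/bar/baz']
```

It works, but the extra bookkeeping makes the function noticeably harder to read.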

If you want to omit the first array item (the slash):

  • set '/' as the default value of previous
  • change the initial array from ['/'] to [].

Here are the changes as a diff:
 const getCumulativePathSegments = (path) =>
   path
     .split('/')
     .filter(Boolean)
     .reduce(
       (segments, segment) => {
-        const previous = segments[segments.length - 1]
+        const previous = segments[segments.length - 1] || '/'
         segments.push(`${previous}${segment}/`)
         return segments
       },
-      ['/']
+      []
     )

 const pathWithTrailingSlash = '/foo/bar/baz/'
 const pathWithoutTrailingSlash = '/foo/bar/baz'

 getCumulativePathSegments(pathWithTrailingSlash)
-//=> ['/', '/foo/', '/foo/bar/', '/foo/bar/baz/']
+//=> ['/foo/', '/foo/bar/', '/foo/bar/baz/']

 getCumulativePathSegments(pathWithoutTrailingSlash)
-//=> ['/', '/foo/', '/foo/bar/', '/foo/bar/baz/']
+//=> ['/foo/', '/foo/bar/', '/foo/bar/baz/']

Test the split and reduce solution on flems.io

Map with a bit of curry

Who doesn't like curry? It makes chicken taste so much better. Currying also makes this cool solution possible:

const getCumulativePathSegments = (path) => {
  const cumulativePath = (acc) => (value) => (acc += `${value}/`)
  return path.split('/').filter(Boolean).map(cumulativePath('/'))
}

const pathWithTrailingSlash = '/foo/bar/baz/'
const pathWithoutTrailingSlash = '/foo/bar/baz'

getCumulativePathSegments(pathWithTrailingSlash)
//=> ['/foo/', '/foo/bar/', '/foo/bar/baz/']

getCumulativePathSegments(pathWithoutTrailingSlash)
//=> ['/foo/', '/foo/bar/', '/foo/bar/baz/']

Wait, what? Let's see how it works by adding some logging:

const getCumulativePathSegments = (path) => {
  const cumulativePath = (acc) => (value) => {
    console.log({ acc, value })
    return (acc += `${value}/`)
  }
  return path.split('/').filter(Boolean).map(cumulativePath('/'))
}

getCumulativePathSegments('/foo/bar/baz')
//=> { acc: '/',         value: 'foo' }
//   { acc: '/foo/',     value: 'bar' }
//   { acc: '/foo/bar/', value: 'baz' }

The outer function's acc parameter acts as an accumulator, like when using an array reducer.
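To make the accumulator behaviour more tangible, here is a small sketch that calls the curried function by hand instead of through map(). The variable names fromRoot and fromRootAgain are made up for this illustration.

```javascript
const cumulativePath = (acc) => (value) => (acc += `${value}/`)

// Calling the outer function once returns an inner function with its own `acc`
const fromRoot = cumulativePath('/')
const first = fromRoot('foo')
const second = fromRoot('bar')
console.log(first) // '/foo/'
console.log(second) // '/foo/bar/'

// A second call to the outer function starts over with a fresh accumulator
const fromRootAgain = cumulativePath('/')
const third = fromRootAgain('baz')
console.log(third) // '/baz/'
```

The inner function closes over acc, so the value survives between the calls that map() makes, but it is not shared between separate calls of the outer function.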

We are reassigning the parameter, which by the way violates the ESLint rule no-param-reassign. This is not a problem per se, but it's a potential source of confusion.

We could avoid reassigning the parameter by getting rid of the currying and creating an acc variable:

 const getCumulativePathSegments = (path) => {
-  const cumulativePath = (acc) => (value) => {
+  let acc = '/'
+  const cumulativePath = (value) => {
     return (acc += `${value}/`)
   }
   return path.split('/').filter(Boolean).map(cumulativePath)
 }

There's one more oddity: the cumulativePath function's return statement contains an addition assignment. This violates the ESLint rule no-return-assign, which again is not a problem per se but is confusing.

Assignment operators have return values, meaning that these two are effectively equivalent:

const cumulativePath = (acc) => (value) => (acc += `${value}/`)

const cumulativePath = (acc) => (value) => {
  acc += `${value}/`
  return acc
}
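
A minimal sketch to convince yourself of this, using a throwaway result variable that is not part of the solution:

```javascript
let acc = '/'

// An assignment is an expression whose value is the value that was assigned
const result = (acc += 'foo/')

console.log(result) // '/foo/'
console.log(acc) // '/foo/'
```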

With all the magic stripped out, here's the full function:

const getCumulativePathSegments = (path) => {
  let acc = '/'
  const cumulativePath = (value) => {
    acc += `${value}/`
    return acc
  }
  return path.split('/').filter(Boolean).map(cumulativePath)
}

Or maybe in a clearer form:

const getCumulativePathSegments = (path) => {
  let acc = '/'
  return path
    .split('/')
    .filter(Boolean)
    .map((value) => {
      acc += `${value}/`
      return acc
    })
}

Now it's quite easy to see what's happening. Though it's not as cool anymore, is it?

This is also very similar to the split and reduce solution. Both solutions have the same caveat: the last array item will have a trailing slash even if the input string doesn't have a trailing slash.

Here's one way to have a slash as the first item of the resulting array:

 const getCumulativePathSegments = (path) => {
   const cumulativePath = (acc) => (value) => (acc += `${value}/`)
-  return (
+  return ['/'].concat(
     path.split('/').filter(Boolean).map(cumulativePath('/'))
   )
 }

 const pathWithTrailingSlash = '/foo/bar/baz/'
 const pathWithoutTrailingSlash = '/foo/bar/baz'

 getCumulativePathSegments(pathWithTrailingSlash)
-//=> ['/foo/', '/foo/bar/', '/foo/bar/baz/']
+//=> ['/', '/foo/', '/foo/bar/', '/foo/bar/baz/']

 getCumulativePathSegments(pathWithoutTrailingSlash)
-//=> ['/foo/', '/foo/bar/', '/foo/bar/baz/']
+//=> ['/', '/foo/', '/foo/bar/', '/foo/bar/baz/']

Test the map and curry solution on flems.io

All credit for the cleverness goes to Nina Scholz's post on Stack Overflow, which I found via another post on Stack Overflow.

Regex to the rescue

If all you have is a hammer, everything looks like a nail. What I mean is that this problem looks like it's made to be solved with a regular expression!

Basically, when we encounter a forward slash or the end of the string ((\/|$)), we want to capture it plus everything before that (.*).

We could try something like this:

const getCumulativePathSegments = (path) => path.match(/.*(\/|$)/g)

const pathWithTrailingSlash = '/foo/bar/baz/'
const pathWithoutTrailingSlash = '/foo/bar/baz'

getCumulativePathSegments(pathWithTrailingSlash)
//=> ['/foo/bar/baz/', '']

getCumulativePathSegments(pathWithoutTrailingSlash)
//=> ['/foo/bar/baz', '']

Or a lazy quantifier:

const getCumulativePathSegments = (path) => path.match(/.*?(\/|$)/g)

const pathWithTrailingSlash = '/foo/bar/baz/'
const pathWithoutTrailingSlash = '/foo/bar/baz'

getCumulativePathSegments(pathWithTrailingSlash)
//=> ['/', 'foo/', 'bar/', 'baz/', '']

getCumulativePathSegments(pathWithoutTrailingSlash)
//=> ['/', 'foo/', 'bar/', 'baz', '']

Or a positive lookbehind assertion:

const getCumulativePathSegments = (path) => path.match(/(?<=.*)(\/|$)/g)

const pathWithTrailingSlash = '/foo/bar/baz/'
const pathWithoutTrailingSlash = '/foo/bar/baz'

getCumulativePathSegments(pathWithTrailingSlash)
//=> ['/', '/', '/', '/', '']

getCumulativePathSegments(pathWithoutTrailingSlash)
//=> ['/', '/', '/', '']

All three regexes fail miserably because we need overlapping matches, and none of these patterns produces them. Check out my previous blog post How to do overlapping matches with regular expressions for a walkthrough.
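
As a rough illustration of the difference, here is a sketch on a simpler input. It uses a lookahead rather than a lookbehind because that is easier to follow at a glance; the input string and the variable names are assumptions made for this example only.

```javascript
const input = 'abcd'

// A consuming pattern: each match uses up its characters, so 'bc' is skipped
const consuming = input.match(/\w\w/g)
console.log(consuming) // ['ab', 'cd']

// A zero-width lookahead consumes nothing, so the search resumes one character
// later and the capturing group can report overlapping pairs
const overlapping = Array.from(input.matchAll(/(?=(\w\w))/g), (m) => m[1])
console.log(overlapping) // ['ab', 'bc', 'cd']
```

The same idea applies to the path problem: the match itself has to be empty, and the interesting text has to come from a capturing group inside the assertion.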

To do overlapping matches, we have to use matchAll() and a positive lookbehind assertion with a capturing group inside of it:

const getCumulativePathSegments = (path) =>
  Array.from(path.matchAll(/(?<=(.*(\/|$)))/g))

const pathWithTrailingSlash = '/foo/bar/baz/'
const pathWithoutTrailingSlash = '/foo/bar/baz'

getCumulativePathSegments(pathWithTrailingSlash)
//=> [
//     ["", "/", "/", index: 1, input: "/foo/bar/baz/", groups: undefined],
//     ["", "/foo/", "/", index: 5, input: "/foo/bar/baz/", groups: undefined],
//     ["", "/foo/bar/", "/", index: 9, input: "/foo/bar/baz/", groups: undefined],
//     ["", "/foo/bar/baz/", "/", index: 13, input: "/foo/bar/baz/", groups: undefined],
//   ]

getCumulativePathSegments(pathWithoutTrailingSlash)
//=> [
//     ["", "/", "/", index: 1, input: "/foo/bar/baz", groups: undefined],
//     ["", "/foo/", "/", index: 5, input: "/foo/bar/baz", groups: undefined],
//     ["", "/foo/bar/", "/", index: 9, input: "/foo/bar/baz", groups: undefined],
//     ["", "/foo/bar/baz", "", index: 12, input: "/foo/bar/baz", groups: undefined],
//   ]

Looks good. Let's get only the second array items, i.e. the matches from the capturing group inside the lookbehind assertion:

const getCumulativePathSegments = (path) =>
  Array.from(
    path.matchAll(/(?<=(.*(\/|$)))/g),
    (match) => match[1]
  )

const pathWithTrailingSlash = '/foo/bar/baz/'
const pathWithoutTrailingSlash = '/foo/bar/baz'

getCumulativePathSegments(pathWithTrailingSlash)
//=> ['/', '/foo/', '/foo/bar/', '/foo/bar/baz/']

getCumulativePathSegments(pathWithoutTrailingSlash)
//=> ['/', '/foo/', '/foo/bar/', '/foo/bar/baz']

Egg-cellent! As you can see, whether the last item of the resulting array has a trailing slash depends on whether the input string has a trailing slash. This is better than the other two solutions; they always include a trailing slash in the last item of the resulting array.

If you want to omit the first array item (the slash), you only need to change .* to .+:

 const getCumulativePathSegments = (path) =>
-  Array.from(path.matchAll(/(?<=(.*(\/|$)))/g), (match) => match[1])
+  Array.from(path.matchAll(/(?<=(.+(\/|$)))/g), (match) => match[1])

 const pathWithTrailingSlash = '/foo/bar/baz/'
 const pathWithoutTrailingSlash = '/foo/bar/baz'

 getCumulativePathSegments(pathWithTrailingSlash)
-//=> ['/', '/foo/', '/foo/bar/', '/foo/bar/baz/']
+//=> ['/foo/', '/foo/bar/', '/foo/bar/baz/']

 getCumulativePathSegments(pathWithoutTrailingSlash)
-//=> ['/', '/foo/', '/foo/bar/', '/foo/bar/baz']
+//=> ['/foo/', '/foo/bar/', '/foo/bar/baz']

Test the regex solution on flems.io

Which solution is the best?

The split and reduce solution might be the most straightforward solution. It's not as elegant as the other two, but at least it's readable.

The map and curry solution is fancy – actually, it's so fancy that I still find it quite confusing. It seems to work, but personally I wouldn't use it.

The regex solution is the shortest solution, but requires the reader to know some regex trickery. It's also the only solution which respects the input string's trailing slash. You could modify the other two solutions to do that too, but they would become more complex and less readable.

I'm a rebel, so I'm using the regex solution on this website. 🤘

Bonus: Usage in breadcrumb navigations

Here's a quick React example of how the getCumulativePathSegments() function could be used to generate breadcrumb navigations:

const pageNames = {
  '/': 'Home',
  '/foo/': 'Foo',
  '/foo/bar/': 'BAR',
  '/foo/bar/baz/': 'Bäz',
}

function Breadcrumb() {
  const pageUrl = '/foo/bar/baz/' // or e.g. `window.location.pathname`

  const items = getCumulativePathSegments(pageUrl).map((segment) => ({
    href: segment,
    text: pageNames[segment],
  }))

  return (
    <nav aria-label="Breadcrumb">
      <ul>
        {items.map((item) => (
          <li key={item.href}>
            <a href={item.href}>{item.text}</a>
          </li>
        ))}
      </ul>
    </nav>
  )
}

Result:

<nav aria-label="Breadcrumb">
  <ul>
    <li><a href="/">Home</a></li>
    <li><a href="/foo/">Foo</a></li>
    <li><a href="/foo/bar/">BAR</a></li>
    <li><a href="/foo/bar/baz/">Bäz</a></li>
  </ul>
</nav>
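
The mapping step does not depend on React. Below is a sketch of it in plain JavaScript, using the regex solution from this post. The getBreadcrumbItems helper and its fallback to the raw path for pages missing from pageNames are assumptions added for illustration.

```javascript
// Same lookup table as in the React example
const pageNames = {
  '/': 'Home',
  '/foo/': 'Foo',
  '/foo/bar/': 'BAR',
  '/foo/bar/baz/': 'Bäz',
}

// The regex solution from this post
const getCumulativePathSegments = (path) =>
  Array.from(path.matchAll(/(?<=(.*(\/|$)))/g), (match) => match[1])

// Hypothetical helper: falls back to the raw segment when no name is known
const getBreadcrumbItems = (pageUrl) =>
  getCumulativePathSegments(pageUrl).map((segment) => ({
    href: segment,
    text: pageNames[segment] ?? segment,
  }))

console.log(getBreadcrumbItems('/foo/bar/baz/'))
console.log(getBreadcrumbItems('/foo/qux/'))
```

For '/foo/qux/', the last item has no entry in pageNames, so its text is the path itself.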