On the Intersection of Context-Free and Regular Languages


Loading...

Date

2023-05

Publication Type

Conference Paper

ETH Bibliography

yes

Citations

Altmetric

Data

Abstract

The Bar-Hillel construction is a classic result in formal language theory. It shows, by a simple construction, that the intersection of a context-free language and a regular language is itself context-free. In the construction, the regular language is specified by a finite-state automaton. However, neither the original construction (Bar-Hillel et al., 1961) nor its weighted extension (Nederhof and Satta, 2003) can handle finite-state automata with $\varepsilon$-arcs. While it is possible to remove $\varepsilon$-arcs from a finite-state automaton efficiently without modifying the language, such an operation modifies the automaton's set of paths. We give a construction that generalizes the Bar-Hillel in the case where the desired automaton has $\varepsilon$-arcs, and further prove that our generalized construction leads to a grammar that encodes the structure of both the input automaton and grammar while retaining the asymptotic size of the original construction.

Publication status

published

Book title

Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

Journal / series

Volume

Pages / Article No.

737 - 749

Publisher

Association for Computational Linguistics

Event

17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023)

Edition / version

Methods

Software

Geographic location

Date collected

Date created

Subject

Formal Languages and Automata Theory (cs.FL); Computation and Language (cs.CL); FOS: Computer and information sciences

Organisational unit

09682 - Cotterell, Ryan / Cotterell, Ryan check_circle

Notes

Funding

Related publications and datasets

Is new version of:
Is supplemented by: https://github.com/rycolab/bar-hillelIs new version of: 10.48550/arXiv.2209.06809