125 Problems in Text Algorithms: with Solutions

Přední strana obálky
Cambridge University Press, 2021 - Počet stran: 320
String matching is one of the oldest algorithmic techniques, yet still one of the most pervasive in computer science. The past 20 years have seen technological leaps in applications as diverse as information retrieval and compression. This copiously illustrated collection of puzzles and exercises in key areas of text algorithms and combinatorics on words offers graduate students and researchers a pleasant and direct way to learn and practice with advanced concepts. The problems are drawn from a large range of scientific publications, both classic and new. Building up from the basics, the book goes on to showcase problems in combinatorics on words (including Fibonacci or Thue-Morse words), pattern matching (including Knuth-Morris-Pratt and Boyer-Moore like algorithms), efficient text data structures (including suffix trees and suffix arrays), regularities in words (including periods and runs) and text compression (including Huffman, Lempel-Ziv and Burrows-Wheeler based methods).
 

Obsah

Combinatorial Puzzles
17
Stringologic Proof of Fermats Little Theorem
18
Simple Case of Codicity Testing
19
Magic Squares and the ThueMorse Word
20
OldenburgerKolakoski Sequence
22
SquareFree Game
24
Fibonacci Words and Fibonacci Numeration System
26
Wythoffs Game and Fibonacci Word
28
Longest CommonParity Factors
161
Word SquareFreeness with DBF
162
Generic Words of Factor Equations
164
Searching an Infinite Word
166
Perfect Words
169
Dense Binary Words
173
Factor Oracle
175
Regularities in Words
180

Distinct Periodic Words
30
A Relative of the ThueMorse Word
33
ThueMorse Words and Sums of Powers
34
Conjugates and Rotations of Words
35
Conjugate Palindromes
37
Many Words with Many Palindromes
39
Short Superword of Permutations
41
Short Supersequence of Permutations
43
Skolem Words
45
Langford Words
48
From Lyndon Words to de Bruijn Words
50
Pattern Matching
53
Border Table
54
Shortest Covers
56
Short Borders
58
Prefix Table
60
Border Table to the Maximal Suffix
62
Periodicity Test
65
Strict Borders
67
Delay of Sequential String Matching
70
Sparse Matching Automaton
72
ComparisonEffective String Matching
74
Strict Border Table of the Fibonacci Word
76
Words with Singleton Variables
78
OrderPreserving Patterns
81
Parameterised Matching
83
GoodSuffix Table
85
Worst Case of the BoyerMoore Algorithm
88
TurboBM Algorithm
90
String Matching with Dont Cares
92
Cyclic Equivalence
93
Simple Maximal Suffix Computation
96
SelfMaximal Words
98
Maximal Suffix and Its Period
100
Critical Position of a Word
103
Periods of Lyndon Word Prefixes
105
Searching Zimin Words
107
Searching Irregular 2D Patterns
110
Efficient Data Structures
111
List Algorithm for Shortest Cover
112
Computing Longest Common Prefixes
113
Suffix Array to Suffix Tree
115
Linear Suffix Trie
119
Ternary Search Trie
122
Longest Common Factor of Two Words
124
Subsequence Automaton
126
Codicity Test
128
LPF Table
130
Sorting Suffixes of ThueMorse Words
134
Bare Suffix Tree
137
Comparing Suffixes of a Fibonacci Word
139
Avoidability of Binary Words
141
Avoiding a Set of Words
144
Minimal Unique Factors
146
Minimal Absent Words
148
Greedy Superstring
152
Shortest Common Superstring of Short Words
155
Counting Factors by Length
157
Counting Factors Covering a Position
160
Three Square Prefixes
181
Tight Bounds on Occurrences of Powers
183
Computing Runs on General Alphabets
185
Testing Overlaps in a Binary Word
188
OverlapFree Game
190
Anchored Squares
192
Almost SquareFree Words
195
Binary Words with Few Squares
197
Building Long SquareFree Words
199
Testing Morphism SquareFreeness
201
Number of Square Factors in Labelled Trees
203
Counting Squares in Combs in Linear Time
206
Cubic Runs
208
Short Square and Local Period
210
The Number of Runs
212
Computing Runs on Sorted Alphabet
214
Periodicity and Factor Complexity
219
Periodicity of Morphic Words
220
Simple Antipowers
222
Palindromic Concatenation of Palindromes
224
Palindrome Trees
225
Unavoidable Patterns
227
Text Compression
230
BW Transform of ThueMorse Words
231
BW Transform of Balanced Words
233
Inplace BW Transform
237
LempelZiv Factorisation
239
LempelZivWelch Decoding
242
Cost of a Huffman Code
244
LengthLimited Huffman Coding
248
Online Huffman Coding
253
RunLength Encoding
256
A Compact Factor Automaton
261
Compressed Matching in a Fibonacci Word
264
Prediction by Partial Matching
266
Compressing Suffix Arrays
269
Compression Ratio of Greedy Superstrings
271
Miscellaneous
275
Binary Pascal Words
276
SelfReproducing Words
278
Weights of Factors
280
LetterOccurrence Differences
282
Factoring with BorderFree Prefixes
283
Primitivity Test for Unary Extensions
286
Partially Commutative Alphabets
288
Greatest FixedDensity Necklace
290
PeriodEquivalent Binary Words
292
Online Generation of de Bruijn Words
295
Recursive Generation of de Bruijn Words
298
Word Equations with Given Lengths of Variables
300
Diverse Factors over a ThreeLetter Alphabet
302
Longest Increasing Subsequence
304
Unavoidable Sets via Lyndon Words
306
Synchronising Words
309
SafeOpening Words
311
Superwords of Shortened Permutations
314
Bibliography
318
Index
332
Autorská práva

Další vydání - Zobrazit všechny

Běžně se vyskytující výrazy a sousloví

O autorovi (2021)

Maxime Crochemore is Emeritus Professor of Université Gustave-Eiffel and of King's College London. He holds an Honorary Doctorate from the University of Helsinki. He is the author of more than 200 articles on algorithms on strings and their applications, and co-author of several books on the subject. Thierry Lecroq is Professor in the Department of Computer Science at the University of Rouen Normandy (France). He is currently head of the research team 'Information Processing in Biology and Health' of the 'Laboratory of Computer Science, Information Processing and System'. He has been one of the coordinator of the working group in Stringology of the French CNRS for more than 10 years. Wojciech Rytter is Professor of University of Warsaw. He is the author of a large number of publications on automata, formal languages, parallel algorithms and algorithms on texts. He is a co-author of several books on these subjects, including 'Efficient Parallel Algorithms', 'Text Algorithms' and 'Analysis of algorithms and data structures'. He is a member of Academia Europaea.

Bibliografické údaje