Skip to main content

Factorial string

· 4 min read
DoremySwee

This is the first activity of Computerization algorithm team. We discussed a problem from Shanghai NOI regional selection, which leverages state compression dynamic programming. The mathematics is beyond our capacity to be rigorously proven.

Problem

Problem link

Given a string SS composed of the first nn lower-case letters, SS is a factorial string if all the permutations of the first nn lower-case letters (there are n!n! of them)(共n!n!种)are subsequences (not necessarily consecutive) of SS.

Given this definition, we can easily verify if a sequence is factorial by bashing, but it's too slow. Please design an algorithm that verifies if a string is factorial in 1 second.

Input constraintsMemory limitExecution time
S450\|S\|\le 450, n26n\le 26125MB1.0s

Solution

The bashing solution is obviously enumerating all permutations and checking if they are subsequences. You get 30% of the score from this method, but the program is pretty meaningless, and the factorial time complexity prohibits further optimization, so we don't discuss further on this attempt.

Now consider the rest 70% marks. For n20n\le 20 we may consider an O(2n)\mathcal{O}(2^n) algorithm. Since a full enumeration is not possible, we shall turn to state compression dynamic programming, and enumerate if a letter exists in the state. Since we need to find if all permutations of the first nn letters exist in SS, we will keep track of the earliest occurence of xx in SS.

The state transfer equation:

f(a,b,c,e)=max(first ‘e’ from f(a,b,c),first ‘c’ from f(a,b,e),first ‘b’ from f(a,c,e),first ‘a’ from f(b,c,e)) \begin{aligned} f(a,b,c,e)=\max(&\text{first }\texttt{`e'}\text{ from }f(a,b,c), \\ &\text{first }\texttt{`c'}\text{ from }f(a,b,e), \\ &\text{first }\texttt{`b'}\text{ from }f(a,c,e), \\ &\text{first }\texttt{`a'}\text{ from }f(b,c,e)) \end{aligned}

But if every lookup is done by a loop, we would exceed the time limit and only get 50 points, so we need some preprocessing. We can conduct binary search after preprocessing, but for the sake of convenience, it may be better to maintain another array that marks the position of the next target letter at every position. The time for this is just 450×26450\times 26.

Now, consider the remaining marks. We notice that for large enough nn, we can just output "No". But how do we determine the exact threshold? If the construction is like n=3    S=abcbabn=3\implies S=\mathtt{abcbab}n=4    S=abcdcbabcdcban=4\implies S=\mathtt{abcdcbabcdcba}, the length of SS is about n2n+1n^2-n+1n21n\le 21, but we weren't able to prove it rigorously. To be on the safer side, because the runtime is still acceptable for n=22n=22, we choose to output "No" for n23n\ge 23.

Program

#include <iostream>

using namespace std;

int main(){
int t;
cin >> t;
while (t--) {
int n;
cin >> n;
string s;
cin >> s;
if (n >= 23) {
cout << "NO" << endl;
} else {
short NEXT[26][450];
for (int i = s.length() - 1; i >= 0; i--) {
if (i == s.length() - 1) {
for (int j = 0; j < 26; j++)
NEXT[j][i] = -1;
} else {
for (int j = 0; j < 26; j++)
NEXT[j][i] = NEXT[j][i + 1];
}
NEXT[s[i] - 'a'][i] = i;
}
short *f = new short[1 << n];
for (int i = 0; i < (1 << n); i++) {
f[i] = 0;
for (int j = 0; (1 << j) <= i; j++) {
if (i & (1 << j)) {
if (NEXT[j][f[i - (1 << j)]] == -1) {
cout << "NO" << endl;
return 0;
}
f[i] = max(f[i], NEXT[j][f[i - (1 << j)]]);
}
}
}
cout << "YES" << endl;
}
}
}