
Let $G=(V,E)$ be a directed acyclic graph, and denote by $\Gamma$ the set of all directed paths in $G$. Given a subset $W\subseteq V$, let $\Gamma_W\subseteq \Gamma$ be the set of all paths $\gamma\in\Gamma$ supported on $V\setminus W$ (i.e. all vertices of $\gamma$ belong to $V\setminus W$). Now define $$l(W)=\max_{\gamma\in \Gamma_W} |\gamma|,$$ where $|\gamma|$ is the number of vertices in $\gamma$.
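For concreteness, $l(W)$ is easy to compute for a given DAG and a candidate set $W$ by dynamic programming over a topological order. A minimal sketch of my own (the adjacency-dict representation `succ` and the use of Python's standard-library `graphlib` are my choices, not part of the question):

```python
from graphlib import TopologicalSorter

def l(W, vertices, succ):
    """Compute l(W): the maximum number of vertices on a directed path
    avoiding W, via dynamic programming over a topological order."""
    # graphlib expects, for each node, the collection of its predecessors
    preds = {v: set() for v in vertices}
    for u in vertices:
        for v in succ.get(u, ()):
            preds[v].add(u)
    longest = {}   # longest W-free path (counted in vertices) ending at v
    best = 0
    for v in TopologicalSorter(preds).static_order():
        if v in W:
            longest[v] = 0          # paths are not allowed to use v
            continue
        longest[v] = 1 + max((longest[u] for u in preds[v] if u not in W),
                             default=0)
        best = max(best, longest[v])
    return best
```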

I want to prove (or disprove) the following claim:

${\bf Claim:}$ For every $\epsilon>0$ and every $k>0$, there are constants $L$ and $N$ such that for any directed acyclic graph $G=(V, E)$ with $|V|>N$ in which every vertex has in-degree plus out-degree at most $k$, there exists a subset $W\subseteq V$ such that $\frac{|W|}{|V|}<\epsilon$ and $l(W)<L$.

The claim is true for directed trees (see Edit 1 for a proof), but the same proof idea fails in more general DAGs. Moreover, the statement fails if we drop the bounded-degree requirement on $G$. Indeed, the complete DAG on vertices $1,\dots,n$ (with an edge $i\to j$ whenever $i<j$) cannot be "blocked" for any $\epsilon>0$ by any set $W$ of size linear in $n$: the surviving vertices still form a single directed path.

Any direction or idea would be welcome.

Edit 1:

For trees, a standard proof goes like this: for $0\leq i\leq L-1$, define $W_L^i$ to be the set of all vertices reachable from the root by a directed path whose length is congruent to $i \pmod L$. Since the graph is a tree, this path is unique for every vertex, so for a given $L$ the sets $\{W_L^i\}_{0\leq i\leq L-1}$ form a partition of $V$. Therefore, choosing $L=\lceil\frac{1}{\epsilon}\rceil$, there is some $i_0$ such that $|W_L^{i_0}|$ is at most $\frac{|V|}{L}\leq\epsilon |V|$. It remains to show that every $W_L^i$ is indeed $L$-blocking, but this is immediate: every step along a directed path down the tree increases the distance from the root by exactly $1$, so the longest path containing no vertices from $W_L^i$ has length at most $L-1$ (it runs strictly between two consecutive "floors" belonging to $W_L^i$).
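The argument above is constructive; here is a small sketch of it in code (my own transcription, assuming the tree is given as a `children` dict with edges pointing away from the root):

```python
from collections import defaultdict, deque

def smallest_depth_class(children, root, L):
    """Partition the vertices of a rooted directed tree by depth mod L and
    return the smallest class W_L^{i_0}.  Any directed path avoiding it has
    fewer than L vertices, since its depths form a run of consecutive
    integers that misses a whole residue class mod L."""
    classes = defaultdict(set)
    queue = deque([(root, 0)])
    while queue:
        v, depth = queue.popleft()
        classes[depth % L].add(v)
        for child in children.get(v, ()):
            queue.append((child, depth + 1))
    # The L classes partition V, so the smallest has at most |V|/L vertices.
    return min((classes[i] for i in range(L)), key=len)
```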

Edit 2:

In general, the claim is true for any DAG in the special case $\epsilon = \frac{2k}{2k+1}$ and $L=1$. To see this, consider the following algorithm:

1. Choose a vertex $v$ that still has neighbours. Keep $v$, and remove all of its neighbours (in both directions) from the graph.

2. If any non-isolated vertex is left, go back to step 1; otherwise stop.

The resulting graph is completely disconnected ($L=1$), and we removed at most a $\frac{2k}{2k+1}$ fraction of the vertices (each kept vertex is charged with at most $2k$ removed neighbours). The claim follows.
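A direct transcription of this procedure (a sketch of my own; `succ`/`pred` are adjacency dicts of out- and in-neighbours):

```python
def greedy_isolate(vertices, succ, pred):
    """Edit 2's greedy procedure: scan the vertices, keep any vertex not yet
    removed, and remove all of its surviving in- and out-neighbours.  The
    kept vertices end up pairwise non-adjacent, so no edges survive; each
    kept vertex is charged with at most 2k removals, which is where the
    2k/(2k+1) fraction comes from."""
    removed, kept = set(), set()
    for v in vertices:
        if v in removed:
            continue
        kept.add(v)
        removed |= (succ.get(v, set()) | pred.get(v, set())) - {v}
    return removed   # the blocking set W
```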

Edit 3:

As Misha Lavrov showed, the previous bound can be tightened and we can prove the claim for $\epsilon=\frac{k}{k+1}$. I discovered that this bound is not tight when the DAG has total degree bounded by $3$. In this case, I will prove the claim for any $\epsilon>\frac{1}{2}$, whereas the previous bound only gives $\epsilon=\frac{2}{3}$. Denote the in-degree and out-degree of a vertex $v$ in $G$ by $in(v)$ and $out(v)$ respectively. By assumption, $in(v)+out(v)\leq 3$ for all $v \in V$. Define four sets $\{V_i\}_{i=0}^3$ by $$V_i=\{v\in V \mid in(v)=i\}.$$

Clearly $\{V_i\}_{i=0}^3$ is a partition of $V$, so at least one of the sets $V_1$, $V_2$ has cardinality at most $\frac{n}{2}$. W.l.o.g. assume it is $V_1$ (otherwise swap the roles of $V_1$ and $V_2$ below, exchanging in- and out-degrees). Let $G'$ be the subgraph of $G$ induced by $V_2$. For every $v\in V_2$ we have $out(v)\leq 1$, so $G'$ is a disjoint union of directed trees, each with a single sink. Using the proof for trees, for any $\epsilon>0$ we can find a subset $W\subseteq V_2$ such that $\frac{|W|}{|V_2|}<\epsilon$ and $W$ is $\frac{1}{\epsilon}$-blocking in $G'$. Finally, define $W'= V_1 \cup W$. On the one hand, $|W'|$ is bounded above by $(\frac{1}{2}+\epsilon)n$; on the other hand, $W'$ is $(\frac{1}{\epsilon}+2)$-blocking: a directed path avoiding $W'$ contains at most one vertex of $V_0$ (necessarily its first vertex, since such vertices have in-degree $0$), at most one vertex of $V_3$ (necessarily its last vertex, since such vertices have out-degree $0$), and otherwise stays inside $V_2\setminus W$, where it is blocked by $W$.

This proves the claim.
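A hedged transcription of this Edit 3 construction (my own code; it handles the case treated above where $V_1$ is the smaller class, the symmetric case being analogous):

```python
def edit3_blocking_set(vertices, succ, pred, L):
    """Sketch of Edit 3 for total degree <= 3, in the case where V_1 is the
    smaller of V_1, V_2.  Inside V_2 every vertex has at most one outgoing
    edge, so following it repeatedly reaches a sink; removing one residue
    class of that distance mod L blocks V_2 exactly as in the tree proof,
    and V_1 is removed wholesale."""
    indeg = {v: len(pred.get(v, ())) for v in vertices}
    V1 = {v for v in vertices if indeg[v] == 1}
    V2 = {v for v in vertices if indeg[v] == 2}

    def dist_to_sink(v):
        # length of the unique outgoing chain of v inside V_2 (finite: G is a DAG)
        d = 0
        while True:
            nxt = [w for w in succ.get(v, ()) if w in V2]
            if not nxt:
                return d
            v, d = nxt[0], d + 1

    classes = [set() for _ in range(L)]
    for v in V2:
        classes[dist_to_sink(v) % L].add(v)
    return V1 | min(classes, key=len)
```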

PS. Crossposted at MO.

  • Is $k$ fixed before $L$ is chosen? Of course every graph has bounded degree for some bound. – Mees de Vries Aug 28 '18 at 09:13
  • Yes, k is fixed. – Yonathan Touati Aug 28 '18 at 09:15
  • WLOG you can take $k = 2$: of course if you can make a "sequence of unblockable graphs" for $k = 2$ and $\epsilon > 0$ fixed, then you can make one for any $k$ with the same $\epsilon$; if you can make such a sequence for $k > 2$ and $\epsilon > 0$ you can make one for $k = 2$ by adding to each vertex an incoming and outgoing "net", although you need to use $\epsilon' = \epsilon/\text{(size of the net)}$ and $L$ will grow more slowly. – Mees de Vries Aug 28 '18 at 15:33
  • What about the weaker (stronger?) related question: fix $k \in \mathbb N$ a maximal degree of nodes in an (undirected) graph. For each $\epsilon > 0$, is there an $L$ such that for each graph $G = (V, E)$ with maximal vertex degree $k$, there is a $W \subseteq V$ with $|W|/|V| < \epsilon$ such that the largest connected component of $G|_{V\setminus W}$ has at most $L$ vertices? – Mees de Vries Aug 29 '18 at 09:18
  • I believe this statement is stronger than mine: indeed, you could just take your DAG and make it into an undirected graph. Then if you could completely disconnect it into small regions, you would immediately get a way to cut long oriented paths into small ones (by a simple counting argument). This stronger statement is actually not true; a counterexample is a regular graph with good expansion (because then any region has a large boundary and needs too many "holes" to be disconnected from the rest of the graph). – Yonathan Touati Aug 29 '18 at 09:34
  • And it doesn't work to turn these graphs into DAGs (by fixing an arbitrary ordering on the vertices)? – Mees de Vries Aug 29 '18 at 11:48
  • Well, you could. But the resulting DAG could still be "breakable" in the weaker sense: you can pick holes such that every oriented path is decomposed into pieces of constant size, yet the holes don't yield a decomposition of the graph into small connected components. – Yonathan Touati Aug 29 '18 at 12:13
  • I don't understand the proof for trees – san Aug 31 '18 at 13:01
  • Edited the question to include a short proof of the claim for directed trees. – Yonathan Touati Aug 31 '18 at 13:39
  • The proof seems to show $\epsilon = \frac{2k}{2k+1}$. We can get $\epsilon = \frac{k}{k+1}$ to work by the local lemma. To do this, color the graph with $k+1$ colors, and use the local lemma to guarantee a coloring where no path of length $L$ is monochromatic. (The hypotheses of the lemma are satisfied for large enough $L$, since each path intersects at most $L^2 k^{L-1}$ others, but there is a $\frac{1}{(k+1)^{L-1}}$ chance of being monochromatic.) Then delete all but the most popular color. – Misha Lavrov Sep 02 '18 at 15:30
  • I don't understand how this works: why is the number of length L paths intersecting a given one $L^2k^{L-1}$? I don't see why we use $L^2$ instead of just L. It also looks like the larger the L, the harder it is to fulfill the lemma's hypotheses. Finally after deleting all but the most popular color, we are left with paths of length 1 at most so the L you are using is not the same as mine. – Yonathan Touati Sep 03 '18 at 08:09
  • $L^2 k^{L-1}$ because we get a factor of $L$ from choosing a point on the given path, and another from choosing how far that point is along the other path. We have $\frac{e L^2 k^{L-1}}{(k+1)^{L-1}}<1$ when $L$ is large. I'm not requiring a proper coloring; there could be paths of length $L-1$ that are monochromatic (and paths of length $L-1$ in the most popular color, as a result), just not paths of length $L$. – Misha Lavrov Sep 03 '18 at 14:45
  • "Proving the claim for any constant ϵ would already be significant progress." It would be actually the end of the story (if $\varepsilon$ does not depend on $k$, of course) because we can bootstrap. On the other hand I'm 99% sure that there are counterexamples though I couldn't find one yet. – fedja Oct 06 '18 at 03:48
  • @fedja Counterexamples would be exciting too :) – Misha Lavrov Oct 06 '18 at 19:53
  • Edited the question to provide a case where the previous bound of @Misha Lavrov isn't tight. – Yonathan Touati Oct 08 '18 at 09:26
  • @Yonathan: At this point, we expect each edit to be substantial. This is not a fresh question anymore. This means that there is no reason to make a lot of small edits. When you decide you want to make an edit to the question, make one at the comfort of your own computer, triple check it for typos and editorial changes, and only then post it. You may also make use of the [meta-tag:sandbox] posts on [meta] which lets you make the edits without bumping the question on the main site. – Asaf Karagila Oct 09 '18 at 11:21
  • Ok sorry. I wasn't aware each single Edit was bumping the question up... – Yonathan Touati Oct 09 '18 at 11:23
  • Why do you start from $k = 3$? Yes, everything is obvious for $k \le 1$, even for $\epsilon = 0$, but $k = 2$ is also a good case. Consider the vertices $U = \{v \mid \deg_+ v = \deg_- v = 1\}$. The set $U$ induces a union of paths and cycles, which are easy to decompose into paths of length $\left\lfloor \frac 2\epsilon \right\rfloor$ by deleting $\epsilon |U| \le \epsilon |V|$ vertices. Taking into account the vertices of $V \setminus U$, we get $L = \left\lfloor \frac 2\epsilon \right\rfloor + 2$. – Smylic Jan 19 '21 at 16:49
  • @MishaLavrov Counterexamples would be exciting too Your wish is my command. :) See my answer stub... – fedja Jan 20 '21 at 02:01

2 Answers


I know it is a bad style but I have no time for proper typing now (maybe I'll do it later), so here is a set of handwritten notes with a counterexample. Questions are welcome, as usual :)

fedja

The following is a writeup of Fedja's solution (with some details added).


$\Large{\text{The Problem}}$

For every $\epsilon > 0$ and $d \ge 1$, are there $L,N$ so that for any directed acyclic graph with at least $N$ vertices and max-degree at most $d$, one can remove at most $\epsilon N$ vertices to destroy all paths of length $L$?

Here, "max-degree" is the maximum over all the out-degrees and the in-degrees of the vertices.


$\Large{\text{The Counterexample}}$

Let $k = 10^{100}$. We answer the question in the negative for $\epsilon := 10^{-5}$ and $d := 2k/\epsilon = 2\cdot 10^{105}$.

Take $L \ge 1$. We shall find arbitrarily large directed acyclic graphs with max-degree at most $d$ such that removing any $\epsilon N$ vertices does not destroy all paths of length $L$.

Take $N$ to be a large power of $2$. Put a random graph on the vertices $1,2,\dots,N$ by including each directed edge $u \to v$, for $u < v$, independently with probability $\frac{\alpha}{v-u}$, where $\alpha$ satisfies $\sum_{v=2}^N \frac{\alpha}{v-1} = k$. Note $\alpha \approx \frac{k}{\log N}$ (so indeed $\frac{\alpha}{v-u} \in [0,1]$). Call the resulting random graph $\mathcal{G}_N$.

Note the expected number of edges in the graph is $\sum_{u=1}^N \sum_{v = u+1}^N \frac{\alpha}{v-u} \approx \alpha N\log(N) \approx kN$. So with probability at least $\frac{1}{3}$, say, the number of edges in our random graph is at most $2kN$. Assume that our random graph has at most $2kN$ edges. Remove all vertices whose in-degree is at least $\frac{k}{\epsilon}$, and remove all vertices whose out-degree is at least $\frac{k}{\epsilon}$. Since each such vertex accounts for at least $\frac{k}{\epsilon}$ of the at most $2kN$ edges, there are at most $2\epsilon N$ vertices of each kind, so we removed at most $4\epsilon N$ vertices.

The remaining (random) graph, denoted $\overline{\mathcal{G}}_N$, is such that every vertex has in-degree plus out-degree bounded above by $\frac{2k}{\epsilon} = d$. We will prove that, with probability $1-o_{N \to \infty}(1)$ (conditioned on the probability-at-least-$\frac{1}{3}$ event that the initial graph has at most $2kN$ edges), we cannot remove $\epsilon N$ vertices to block all paths of length $L$.
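For concreteness, here is a small simulation sketch of this construction (my own code, not part of the argument; the conditioning on having at most $2kN$ edges is omitted, and the actual parameters above are of course far too large to simulate):

```python
import random

def sample_trimmed_graph(N, k, eps, seed=0):
    """Sample G_N: include each edge u -> v (u < v) independently with
    probability alpha/(v-u), where alpha is chosen so that the expected
    out-degree of vertex 1 is exactly k, then drop every vertex whose
    in-degree or out-degree is at least k/eps."""
    rng = random.Random(seed)
    alpha = k / sum(1.0 / (v - 1) for v in range(2, N + 1))   # ~ k / log N
    succ = {v: set() for v in range(1, N + 1)}
    pred = {v: set() for v in range(1, N + 1)}
    for u in range(1, N + 1):
        for v in range(u + 1, N + 1):
            if rng.random() < alpha / (v - u):
                succ[u].add(v)
                pred[v].add(u)
    cutoff = k / eps
    kept = {v for v in range(1, N + 1)
            if len(succ[v]) < cutoff and len(pred[v]) < cutoff}
    return kept, succ, pred
```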


$\Large{\text{Proof that the Counterexample Works}}$

For the proof that the counterexample works, we may keep the vertices with in-degree or out-degree at least $\frac{k}{\epsilon}$ (call them for a brief moment $\textit{popular}$ vertices). Indeed, if we can block all paths of length $L$ by removing $\epsilon N$ vertices in the graph with the popular vertices removed, then we can block all paths of length $L$ by removing $5\epsilon N$ vertices in the graph with the popular vertices not removed. Let $c = 10^{-3}$. Note $c > 30\epsilon$.

$\textbf{Definition}$: Call a graph $L$-tame if its vertices can be partitioned into $L$ sets $V_0,\dots,V_{L-1}$ such that (1) the number of edges $u \to v$ with $u \in V_i, v \in V_j$ for $i \le j$ is at most $ckN$ and (2) for every $1 \le i \le L-1$ and every $u \in V_i$, there exists an edge $u \to v$ with $v \in V_{i-1}$.

We prove the following lemma in the next section.

$\textbf{Main Lemma}$: For every $L \in \mathbb{N}$, $\Pr[\mathcal{G}_N$ is $L$-tame$] \to 0$ as $N \to \infty$.

Assuming the Main Lemma, we now go on to prove that the counterexample works (with high probability). We begin with a lemma.

$\textbf{Lemma 1}$: In $\mathcal{G}_N$, the probability that a fixed set of $\epsilon N$ vertices has more than $ckN$ outgoing edges is at most $e^{3k\epsilon N}e^{-ckN}$.

Proof: Let $E_o$ denote the number of outgoing edges from our fixed set of $\epsilon N$ vertices. By Markov's inequality applied to $e^{E_o}$, it suffices to show that $\mathbb{E}[e^{E_o}] \le e^{3k \epsilon N}$. This follows from independence, the fact that vertex $1$ has the largest expected out-degree, and the bound $1+t \le e^t$: \begin{align*} \mathbb{E}[e^{E_o}] &\le \left(\prod_{v=2}^N \left(1-\frac{\alpha}{v-1}+\frac{\alpha}{v-1}e\right)\right)^{\epsilon N} \\ &\le e^{\left(\sum_{v=2}^N \frac{2\alpha}{v-1}\right)\epsilon N} \\ &\le e^{3k\epsilon N}.\end{align*} $\square$

$\textbf{Lemma 2}$: The probability that $\mathcal{G}_N$ has $5\epsilon N$ vertices so that removing them destroys all paths of length $L$ is $o(1)$ as $N \to \infty$.

Proof: Suppose that we can remove $5\epsilon N$ vertices (the blocked vertices) so that the longest remaining path has length at most $L-1$. By Lemma 1 (applied with $5\epsilon N$ in place of $\epsilon N$) and a union bound, the probability that some set of $5\epsilon N$ vertices has more than $ckN$ outgoing edges is at most ${N \choose 5\epsilon N} e^{15k\epsilon N}e^{-ckN} \le e^Ne^{15k\epsilon N}e^{-ckN} \le e^{-\frac{1}{2}ckN}$, which is $o(1)$, so we may assume that no set of $5\epsilon N$ vertices has more than $ckN$ outgoing edges. We then argue that our graph is $L$-tame, which will finish the proof of Lemma 2, in light of the Main Lemma.

Define $V_0$ to be the blocked vertices and the vertices of outdegree $0$, and, for $1 \le i \le L-1$, $V_i$ to be the vertices $v$ such that the longest unblocked path starting at $v$ has length $i$ (i.e., the number of edges in the longest path is $i$). Note that since the graph is acyclic, there can be no edges from $V_i$ to $V_j$ if $1 \le i \le j$. Therefore, the only edges from $V_i$ to $V_j$ with $i \le j$ are out-edges from the blocked points. Hence, property (1) of an $L$-tame graph is satisfied. And property (2) is satisfied by the definition of the $V_i$'s. $\square$

The proof that the counterexample works now follows immediately: conditioning on an event of probability at least $1/3$ increases probabilities by at most a factor of $3$, so the probability that there are $5\epsilon N$ vertices whose removal destroys all paths of length $L$ is still $o(1)$ as $N \to \infty$.


$\Large{\text{Proof of Main Lemma}}$

We use the crude union bound $$\Pr[\mathcal{G}_N \text{ is } \text{$L$-tame}] \le \sum_{[N] = V_0\sqcup\dots\sqcup V_{L-1}} \Pr[V_0,\dots,V_{L-1} \text{ satisfy (1),(2)}].$$ For fixed $V_0,\dots,V_{L-1}$, the conditions (1) and (2) in the definition of $L$-tame refer to disjoint sets of edges, so by independence, we have $$\Pr[V_0,\dots,V_{L-1} \text{ satisfy (1),(2)}] = \Pr[V_0,\dots,V_{L-1} \text{ satisfy (1)}]\times \Pr[V_0,\dots,V_{L-1} \text{ satisfy (2)}].$$

In the next section, we show $$\Pr[V_0,\dots,V_{L-1} \text{ satisfy (1)}] \le e^{-kN/160}$$ for each (fixed) partition $V_0,\dots,V_{L-1}$ of $[N]$. In the section after, we show that $$\sum_{[N] = V_0\sqcup\dots\sqcup V_{L-1}} \Pr[V_0,\dots,V_{L-1} \text{ satisfy (2)}] \le (N+1)^L e^{3N} k^N.$$ Combining the results of the two sections gives that $$\Pr[\mathcal{G}_N \text{ is } \text{$L$-tame}] \le e^{-kN/160}(N+1)^L e^{3N}k^N,$$ which clearly goes to $0$ as $N \to \infty$ (recall $k = 10^{100}$, so the factor $e^{-kN/160}$ swamps $e^{3N}k^N = e^{(3+\log k)N}$).


$\Large{\text{Bounding $\Pr[V_0,\dots,V_{L-1} \text{ satisfy (1)}]$}}$

In this section, fix a partition $[N] = V_0\sqcup\dots\sqcup V_{L-1}$. We shall show $$\Pr[V_0,\dots,V_{L-1} \text{ satisfy (1)}] \le e^{-kN/160}.$$ To this end, define an edge $\vec{e}$ to be $\textit{bad}$ if it connects $u \to v$ with $u \in V_i$ and $v \in V_j$ where $i \le j$. Let $p_{\vec{e}} = \frac{\alpha}{v-u}$ be the probability of inclusion of the edge $\vec{e}$.


$\textbf{Lemma 3}$: It holds that $$\sum_{\vec{e} \text{ bad}} p_{\vec{e}} \ge \left[\frac{1}{16}\log N + 3\sum_{i=0}^{L-1} P_i\log P_i\right]\alpha N,$$ where $P_i := \frac{|V_i|}{N}$.

Proof: We induct on $N$ (for $N$ a power of $2$). For $N=1$, the inequality becomes $0 \ge 0$. Now suppose the inequality holds for $N$ (for any partition), and let's work with a partition of $[2N]$. Let $$P_i^- = \frac{|V_i\cap[1,N]|}{N}$$ $$P_i^+ = \frac{|V_i\cap[N+1,2N]|}{N},$$ so that, by the induction hypothesis, $$\sum_{\substack{\vec{e} \text{ bad} \\ \vec{e} \text{ internal}}} p_{\vec{e}} \ge \left[\frac{2}{16}\log N+3\sum_{i=0}^{L-1}\left(P_i^-\log P_i^- + P_i^+\log P_i^+\right)\right]\alpha N,$$ where an edge $\vec{e}$ is \textit{internal} if it lies in $[1,N]$ or in $[N+1,2N]$. Above, we just critically used the fact that $\frac{1}{j-i}$ weights all scales the same. Now, using the trivial $\frac{\alpha}{j-i} \ge \frac{\alpha}{2N}$, we have \begin{align*} \sum_{\substack{\vec{e} \text{ bad} \\ \vec{e} \text{ external}}} p_{\vec{e}} &\ge \frac{\alpha}{2N}\sum_{i \le j} |V_i\cap [1,N]|\cdot |V_j\cap [N+1,2N]| \\ &= \frac{1}{2}\alpha N\sum_{i \le j} P_i^-P_j^+.\end{align*} Thus, $$\sum_{\vec{e} \text{ bad}} p_{\vec{e}} \ge \left[\frac{1}{16}\log N + \frac{3}{2}\sum_{i=0}^{L-1}\left(P_i^-\log P_i^- + P_i^+\log P_i^+\right)+\frac{1}{4}\sum_{i \le j} P_i^-P_j^+\right]\alpha(2N),$$ and we wish to obtain $$\sum_{\vec{e} \text{ bad}} p_{\vec{e}} \ge \left[\frac{1}{16}\log(2N)+3\sum_{i=0}^{L-1} P_i\log P_i\right]\alpha(2N).$$ So, multiplying through by $2$, we just need to show $$3\sum_{i=0}^{L-1}\left(P_i^-\log P_i^-+P_i^+\log P_i^+-2 P_i\log P_i\right)+\frac{1}{2}\sum_{i \le j} P_i^-P_j^+ \ge \frac{1}{8}\log 2.$$ Note, for $x \in [-1,1]$, $(1+x)\log(1+x)+(1-x)\log(1-x) \ge x^2$ (Taylor expand), so, for any $x,y \ge 0$, we have \begin{align*} x\log x+y\log y-2(\frac{x+y}{2})\log(\frac{x+y}{2}) &= \frac{x+y}{2}\left[\frac{2x}{x+y}\log(\frac{2x}{x+y})+\frac{2y}{x+y}\log(\frac{2y}{x+y})\right] \\ &\ge \frac{x+y}{2}\left(\frac{x-y}{x+y}\right)^2 \\ &= \frac{(x-y)^2}{2(x+y)}.\end{align*} We thus obtain $$P_i^-\log P_i^-+P_i^+\log P_i^+ - 2P\log P \ge \frac{(P^--P^+)^2}{4P},$$ so it suffices to show $$3\sum_i \frac{(P_i^--P_i^+)^2}{4P_i}+\frac{1}{2}\sum_{i \le j} P_i^-P_j^+ \ge \frac{1}{8}\log 2.$$ Let $$I_1 = \{i : P_i^+ \le \frac{1}{2}P_i^-\}$$ $$I_2 = \{i : P_i^+ > \frac{1}{2} P_i^-\}.$$ For $i \in I_1$, we have $$P_i^--P_i^+ \ge \frac{1}{2} P_i^-$$ and $$P_i = \frac{P_i^-+P_i^+}{2} \le \frac{3}{4}P_i^-,$$ so $$3\sum_i \frac{(P_i^--P_i^+)^2}{4P_i} \ge 3\sum_{i \in I_1} \frac{\frac{1}{4}(P_i^-)^2}{3P_i^-} = \frac{1}{4}\sum_{i \in I_1} P_i^-.$$ Also, $$\frac{1}{2}\sum_{i \le j} P_i^- P_j^+ \ge \frac{1}{2}\sum_{\substack{i \le j \\ i,j \in I_2}} P_i^-P_j^+ \ge \frac{1}{4}\sum_{\substack{i \le j \\ i,j \in I_2}} P_i^-P_j^- \ge \frac{1}{8}\left(\sum_{i \in I_2} P_i^-\right)^2.$$ Letting $x = \sum_{i \in I_1} P_i^-$ and noting $\sum_{i \in I_2} P_i^- = 1-x$, we obtain $$3\sum_i \frac{(P_i^--P_i^+)^2}{4P_i}+\frac{1}{2}\sum_{i \le j} P_i^-P_j^+ \ge \frac{1}{4}x+\frac{1}{8}(1-x)^2 = \frac{1}{8}(1+x^2) \ge \frac{1}{8} \ge \frac{\log 2}{8}.$$ We are done. $\square$
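Aside from the bookkeeping, the induction rests on two elementary inequalities: $(1+x)\log(1+x)+(1-x)\log(1-x)\ge x^2$ and the estimate $x\log x+y\log y-2\frac{x+y}{2}\log\frac{x+y}{2}\ge\frac{(x-y)^2}{2(x+y)}$ derived from it. A quick numerical sanity check of both (my own, with natural logarithms):

```python
import math

def xlogx(t):
    """t*log(t) with the convention 0*log(0) = 0 (natural logarithm)."""
    return 0.0 if t == 0 else t * math.log(t)

# (1+x)log(1+x) + (1-x)log(1-x) >= x^2 on [-1, 1]
assert all(xlogx(1 + x) + xlogx(1 - x) >= x * x - 1e-12
           for x in (i / 1000 - 1 for i in range(2001)))

# x log x + y log y - 2*((x+y)/2)*log((x+y)/2) >= (x-y)^2 / (2(x+y))
assert all(xlogx(x) + xlogx(y) - 2 * xlogx((x + y) / 2)
           >= (x - y) ** 2 / (2 * (x + y)) - 1e-12
           for x in (i / 100 for i in range(1, 101))
           for y in (j / 100 for j in range(1, 101)))
```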

$\textbf{Corollary}$: The probability that there are at most $ckN$ bad edges is at most $e^{-\frac{kN}{160}}$.

Proof: Let $\mathcal{P} = \sum_{\vec{e} \text{ bad}} p_{\vec{e}}$ denote the expected number of bad edges. By Lemma 3 and the entropy bound $\sum_{i=0}^{L-1} P_i\log P_i \ge -\log L$, we have $$\mathcal{P} \ge \left[\frac{1}{16}\log N-3\log L\right]\alpha N,$$ so for $N$ large enough (recall $\alpha\log N \to k$), we have $$\mathcal{P} \ge \frac{1}{17}(\log N)\alpha N \ge \frac{kN}{20}.$$ Now, let $|E_b|$ denote the (random) number of bad edges. For $\tau > 0$, using independence, we have $$\mathbb{E} e^{\tau(\mathcal{P}-|E_b|)} = \prod_{\vec{e} \text{ bad}} \left[(1-p_{\vec{e}})e^{p_{\vec{e}}\tau}+p_{\vec{e}}e^{(p_{\vec{e}}-1)\tau}\right].$$

We claim that $f(\tau) := (1-p)e^{p\tau}+pe^{(p-1)\tau} \le e^{\frac{p}{2}\tau^2}$ for all $\tau \in [0,1]$, for any $p \in [0,1]$. Indeed, \begin{align*} f(\tau) &\le 1+\frac{1}{2}\left(\max_{\xi \in [0,\tau]} f''(\xi)\right)\tau^2 \\ &= 1+\frac{1}{2}\max_{\xi \in [0,\tau]} \left[(1-p)p^2e^{p\xi}+p(1-p)^2e^{(p-1)\xi}\right]\tau^2 \\ &\le 1+\frac{1}{2}\left(\max_{\xi \in [0,\tau)} p(1-p)e^{p\xi}\right)\tau^2 \\ &\le 1+\frac{p}{2}\tau^2 \\ &\le e^{\frac{p}{2}\tau^2}.\end{align*}
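As a sanity check (my own, not part of the proof), the claimed bound $f(\tau) \le e^{\frac{p}{2}\tau^2}$ can be verified numerically on a grid of $(p,\tau)\in[0,1]^2$:

```python
import math

def f(p, tau):
    """The factor (1-p)e^{p*tau} + p*e^{(p-1)*tau} appearing above."""
    return (1 - p) * math.exp(p * tau) + p * math.exp((p - 1) * tau)

assert all(f(p, tau) <= math.exp(p * tau * tau / 2) + 1e-12
           for p in (i / 100 for i in range(101))
           for tau in (j / 100 for j in range(101)))
```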

Therefore, for $\tau \in [0,1]$, we have $$\mathbb{E} e^{\tau(\mathcal{P}-|E_b|)} \le e^{\frac{\mathcal{P}}{2}\tau^2},$$ so $$\Pr\left[|E_b| \le ckN\right] \le \Pr\left[|E_b| \le \frac{\mathcal{P}}{2}\right] \le e^{\frac{\mathcal{P}}{2}(\tau^2-\tau)},$$ and taking $\tau = \frac{1}{2}$ gives an upper bound of $e^{-\frac{\mathcal{P}}{8}} \le e^{-\frac{kN}{160}}$. $\square$


$\Large{\text{Bounding $\sum_{[N] = V_0\sqcup\dots\sqcup V_{L-1}} \Pr[V_0,\dots,V_{L-1} \text{ satisfy (2)}]$}}$

We wish to show $$\sum_{[N] = V_0\sqcup\dots\sqcup V_{L-1}} \Pr[V_0,\dots,V_{L-1} \text{ satisfy (2)}] \le (N+1)^L e^{3N} k^N.$$ Fix some $N_0,\dots, N_{L-1} \in \{0,\dots,N\}$ that add up to $N$. Since there are at most $(N+1)^L$ choices for $(N_0,\dots,N_{L-1})$, it suffices to show $$\sum_{\substack{[N] = V_0\sqcup\dots\sqcup V_{L-1} \\ |V_i| = N_i \hspace{1.5mm} \forall 0 \le i \le L-1}} \Pr[V_0,\dots,V_{L-1} \text{ satisfy (2)}] \le e^{3N} k^N.$$ Note that we may rewrite $$\sum_{\substack{[N] = V_0\sqcup\dots\sqcup V_{L-1} \\ |V_i| = N_i \hspace{1.5mm} \forall 0 \le i \le L-1}} \Pr[V_0,\dots,V_{L-1} \text{ satisfy (2)}]$$ $$ = \sum_{V_0 \subseteq [N], |V_0| = N_0} \sum_{\substack{V_1 \subseteq [N], |V_1| = N_1 \\ V_1 \cap V_0 = \emptyset}} \dots \sum_{\substack{V_{L-1} \subseteq [N], |V_{L-1}| = N_{L-1} \\ V_{L-1} \cap V_j = \emptyset \hspace{1.5mm} \forall 0 \le j \le L-2}} \Pr[V_0,\dots,V_{L-1} \text{ satisfy (2)}].$$

We write $V_{j+1} \rightrightarrows V_j$ if each vertex in $V_{j+1}$ has an edge to $V_j$, so that the above becomes $$\sum_{V_0 \subseteq [N], |V_0| = N_0} \sum_{\substack{V_1 \subseteq [N], |V_1| = N_1 \\ V_1 \cap V_0 = \emptyset}} \dots \sum_{\substack{V_{L-1} \subseteq [N], |V_{L-1}| = N_{L-1} \\ V_{L-1} \cap V_j = \emptyset \hspace{1.5mm} \forall 0 \le j \le L-2}} \Pr[V_{j+1} \rightrightarrows V_j \hspace{1.5mm} \hspace{1.5mm} \forall 0 \le j \le L-2],$$ which by repeated application of independence is equal to $$\sum_{V_0 \subseteq [N], |V_0| = N_0} \sum_{\substack{V_1 \subseteq [N], |V_1| = N_1 \\ V_1 \cap V_0 = \emptyset}} \dots \sum_{\substack{V_{L-1} \subseteq [N], |V_{L-1}| = N_{L-1} \\ V_{L-1} \cap V_j = \emptyset \hspace{1.5mm} \forall 0 \le j \le L-2}} \prod_{j=0}^{L-2} \Pr[V_{j+1} \rightrightarrows V_j].$$

$\textbf{Lemma 4}$: Suppose $V_0,\dots,V_j$ are fixed, pairwise-disjoint subsets of $[N]$ with $|V_i| = N_i$ for $0 \le i \le j$. Then $$\sum_{\substack{V_{j+1} \subseteq [N], |V_{j+1}| = N_{j+1} \\ V_{j+1} \cap V_i = \emptyset \hspace{1.5mm} \forall 0 \le i \le j}} \Pr[V_{j+1} \rightrightarrows V_j] \le e^{N_{j+1}}k^{N_{j+1}}\left(\frac{N_j}{N_{j+1}}\right)^{N_{j+1}}.$$

Proof: By independence of the edges leaving distinct vertices and a union bound over $v \in V_j$, $$\Pr[V_{j+1} \rightrightarrows V_j] = \prod_{u \in V_{j+1}}\Pr\left[u \text{ has an edge to } V_j\right] \le \prod_{u \in V_{j+1}}\left(\sum_{v \in V_j} \Pr\left[u \text{ has an edge to } v\right]\right).$$ For $u \in V_{j+1}$, let $\sigma(u) = \sum_{v \in V_j} \Pr[u \text{ has an edge to } v]$. Then, \begin{align*}\sum_{\substack{V_{j+1} \subseteq [N], |V_{j+1}| = N_{j+1} \\ V_{j+1} \cap V_i = \emptyset \hspace{1.5mm} \forall 0 \le i \le j}} \prod_{u \in V_{j+1}} \sigma(u) &\le \frac{1}{N_{j+1}!}\left(\sum_{u \in \{1,\dots,N\}} \sigma(u)\right)^{N_{j+1}} \\ &\le \frac{1}{N_{j+1}!}\left(kN_j\right)^{N_{j+1}},\end{align*} where the last step uses $\sum_{u} \sigma(u) = \sum_{v \in V_j}\sum_{u \neq v} \Pr[u \text{ has an edge to } v] \le kN_j$, since every vertex has expected in-degree at most $k$. The Lemma then follows from Stirling's approximation, $\frac{1}{m!} \le \left(\frac{e}{m}\right)^m$. $\square$
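Aside: a quick numerical check (my own) of the subset-product bound used in the last display, namely that the sum over $m$-element subsets of $\prod_u \sigma(u)$ is at most $\frac{1}{m!}\left(\sum_u \sigma(u)\right)^m$:

```python
import itertools, math, random

rng = random.Random(1)
sigma = [rng.random() for _ in range(8)]   # arbitrary nonnegative weights
for m in range(1, 6):
    subset_sum = sum(math.prod(c) for c in itertools.combinations(sigma, m))
    assert subset_sum <= sum(sigma) ** m / math.factorial(m) + 1e-12
```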

Repeated application of Lemma 4 (summing over $V_{L-1}$, then $V_{L-2}$, and so on down to $V_1$), together with the at most ${N \choose N_0} \le 2^N \le e^N$ choices of $V_0$, thus gives \begin{align*}\sum_{\substack{[N] = V_0\sqcup\dots\sqcup V_{L-1} \\ |V_i| = N_i \hspace{1.5mm} \forall 0 \le i \le L-1}} \Pr[V_0,\dots,V_{L-1} \text{ satisfy (2)}] &\le e^N \cdot e^{N_1+\dots+N_{L-1}} k^{N_1+\dots+N_{L-1}}\prod_{j=0}^{L-2} \left(\frac{N_j}{N_{j+1}}\right)^{N_{j+1}} \\ &\le e^N \cdot e^N k^N \prod_{j=0}^{L-2} e^{N_j} \\ &\le e^{3N}k^N,\end{align*} where we used $N_j/N_{j+1} \le e^{N_j/N_{j+1}}$, so that $\left(\frac{N_j}{N_{j+1}}\right)^{N_{j+1}} \le e^{N_j}$ and $\prod_{j=0}^{L-2} e^{N_j} \le e^N$.