----------------------- REVIEW 1 ---------------------
PAPER: 38
TITLE: Asynchronous Non-Bayesian Learning in the Presence of Crash Failures
AUTHORS: Lili Su and Nitin Vaidya
OVERALL EVALUATION: 1 (weak accept)
REVIEWER'S CONFIDENCE: 4 (high)
----------- Review -----------
In this paper, the authors address the Non-Bayesian Learning problem : many agents (or sensors) in a network have to find the true environmental state \theta in a finite set \Theta given private signals and their distribution (for each possible state \theta_i). Agents can share and revise their beliefs with their neighbors during the execution across a network. At the end, the network of agents have to reach a consensus on the true environmental state.
Authors consider the understudied asynchronous case where message between agents take a arbitrary but finite long time and some agents could suffer crash fault. Their contributions are the minimal condition on the network (and agents knowledges) to perform a consensus-based non-Bayesian learning taking into account asynchronous and eventual crash in the global system. Their analysis gives a convergence rate of the learning of the overall system.
The context of asynchronous distributed learning in presence of crash failures is very important for deploying large scale systems. So theoretical results given on asynchronous non-Bayesian learning in this paper are interesting for further works on this subject.
The introduction is clear and contribution is well-defined. They motivate the interest of the subject. Some graphical representations of the problem and graph definitions would be appreciated. There are no experimentation and no clear interpretation of results given.
Authors ensure that results can be generalized to time-varing networks easily but no clue is given.
A possible error in (7), which maybe don’t impact the next sections and proofs, should be corrected or clarified (see below).
Feldman et al. published a paper [1] on non-Bayesian asynchronous learning in social networks. They studied the networks structure to obtain a consensus. Authors should explain how their work is to be compared to Feldman’s paper.
Details:
- Equation (7) should have an error or lack of explanation. u_i^t is define in both (6) and (7) for all i in N^-[t]. Do the authors use (7) in the case where agent don’t receive private signal ? From Belief Update Rule all agents in N^-[t] receive a private signal at each iteration so I don’t understand the interest of (7). Moreover, the equality u_t^i=u_{t-1}^i looks wrong or not clear for me in (7).
- Lack of an intuitive explanation of Phi and the link with the next section : what does it represents ? And why did authors analyze the convergence in (14) ?
- The equation (14) is it hold for all r <=t+1 and all k \in V ?
- Authors propose (with the proof of Theorem 2) an independent result. To highlight this contribution, they should better introduce Wolfowitz and Hajnal works and the delta and nu functions to explain the interest of this result. Without this, Lemma 1 and 2 have no interest in this paper.
- Notations S_H and S_i are a little bit confusing. First is a subset of V and the second is the signal space for agent I.
- Interpretation of results (specially the convergence rate) can be deepened. Conditions 1-2 and 3: are they compatible with a realistic large scale system ? Time convergence rate: is it acceptable for realistic application ? On what important network structure is it depending (size for example?) ?
- References 16 → Shahrampour et al. published their paper in the IEEE Transactions on Automatic Control 2015
[1] Feldman, M., Immorlica, N., Lucier, B., & Weinberg, S. M. (2014). Reaching consensus via non-bayesian asynchronous learning in social networks. arXiv preprint arXiv:1408.5192.
----------------------- REVIEW 2 ---------------------
PAPER: 38
TITLE: Asynchronous Non-Bayesian Learning in the Presence of Crash Failures
AUTHORS: Lili Su and Nitin Vaidya
OVERALL EVALUATION: 2 (accept)
REVIEWER'S CONFIDENCE: 3 (medium)
----------- Review -----------
Title: Asynchronous Non-Bayesian Learning in the Presence of Crash Failures
Authors: Lili Su and Nitin Vaidya
This paper considers the setting of distributed agents, organized in a directed network, that receive local signals drawn from prior distributions (in a Bayesian manner) about some global state. The goal is to learn the true global state which requires the agents to exchange information over the links of the network. Since the standard approach of Bayesian belief updates (essentially, exploiting all the available information) requires significant computation as well as communication resources, non-Bayesian learning rules were proposed by Jadbabaie et al. [3] and studied extensively in a series of papers since then. The current paper studies the ‘log-linear learning rule’ in which each node takes the geometric average of the local Bayesian update and its neighbors’ beliefs. This learning rule has been investigated and analyzed by several papers in the past, but always in the context of a synchronous network and reliable nodes. This paper extends this investigation to as!
ynchronous networks with crash faults. It provides a full characterization of the network topologies that guarantee successful learning and prove that the (distributed) learning rule converges within finite time.
I like the problem studied in this paper and the model in which it is defined. I’m an outsider to this field, so it is difficult for me to compare the technical contribution to the prior art, however it is clear that some significant ideas were required to adapt the techniques to asynchronous networks with crash faults — a setting that was not considered beforehand. Based on that impression, I support accepting this paper for publication in SSS16.
Detailed comments:
- Page 3, line -2: Why do you make this assumption?
- Page 4, “The goal is to have all the non-faulty agents (agents in N) collaboratively learn the environmental state θ^∗.”: By what means? Not clear at this stage.
- Page 4, 1st sentence of the network identifiability paragraph: This condition applies only to the log-linear fusion rule? Explain.
- Page 4, equation (1): What is \theta here? This condition is underspecified.
OK, I see that you explain later on that (1) should hold for all \theta \neq \theta^*, but this quantification should appear next to (1).
- The next line: This is the KL-divergence between \ell_j parameterized by \theta^* and \ell_j parameterized by \theta, right? So it basically means that the distributions differ for at least one agent --- makes more sense to write it like that or at least add this intuition.
- Footnote 2 is in a wrong location.
- Page 5, “without loss of generality, we say that this agent crashes in iteration t+1”: I presume that you want it to crash right at the beginning of iteration t+1, before it sends its belief vector. Elaborate.
- Page 5, “Note that in iteration t, an agent can only receive messages tagged with (asynchronous) iteration index t”: According to your algorithm, they will be tagged with t-1.
- Page 5, par -2, last line: This holds only if i \in \bar{N}[t].
- Page 5, equation (7): The \in should be \notin (doesn't make sense otherwise).
- Page 6, the almost surely notation: In this notation, you mean that t goes to infinity?
- Page 7, equation (14): What is the intuition behind this condition? How does it relate to ergodicity?
- Page 7, “The following condition is both necessary and sufficient for (14) to hold”: Are you going to prove it?
- Page 7: The reduction from the classic consensus to the problem studied in this paper is not so clear, e.g., where do the random signals come from?
- Page 8, Theorem 2: What is \xi?
----------------------- REVIEW 3 ---------------------
PAPER: 38
TITLE: Asynchronous Non-Bayesian Learning in the Presence of Crash Failures
AUTHORS: Lili Su and Nitin Vaidya
OVERALL EVALUATION: 1 (weak accept)
REVIEWER'S CONFIDENCE: 3 (medium)
----------- Review -----------
The authors apply consensus-based non-Bayesian learning to a setting with faulty processes and/or message loss. They also consider message delay but this is indistinguishable from a fault in an asynchronous environment. They show that all non-faulty agents asymptotically agree on the true state almost surely.
As a byproduct of their convergence proofs they obtain a generalized version of the results by Wolfowitz and Hajnal.
In my opinion they enforce a rather strong constraint on the graph, i. e. they require each reduced graph (the graph without faulty nodes) to have a unique source component. This means that the remaining non-faulty nodes (or at least a significant subset of them) is strongly connected. Even though this is a strong constraint, it seems natural in the general case.
The provided proofs seem to be sound, even though I didn't check the appendix thoroughly.
small comment:
I believe there is a small mistake on p. 5, last line. I suppose it should be \forall i \not\in \bar{N}[t] instead of \forall i /in.