...and other) small-world models_免费下载--688IT编程网

Analyzing Kleinberg’s(and other)Small-world Models∗

Chip Martel Computer Science Dept.

University of California

Davis,CA95616 martel@cs.ucdavis.edu

Van Nguyen

Computer Science Dept.

University of California

Davis,CA95616 nguyenvk@cs.ucdavis.edu

ABSTRACT

We analyze the properties of Small-World networks,where links are much more likely to connect“neighbor nodes”than distant nodes.In particular,our analysis provides new re-sults for Kleinberg’s Small-World model and its extensions. Kleinberg adds a number of directed long-range ran

dom links to an n×n lattice network(vertices as nodes of a grid, undirected edges between any two adjacent nodes).Links have a non-uniform distribution that favors arcs to close nodes over more distant ones.He shows that the following phenomenon occurs:between any two nodes a path with ex-pected length O(log2n)can be found using a simple greedy algorithm which has no global knowledge of long-range links. We show that Kleinberg’s analysis is tight:his algorithm achievesθ(log2n)delivery time.Moreover,we show that the expected diameter of the graph isθ(log n),a log n fac-tor smaller.We also extend our results to the general k-dimensional model.Our diameter results extend traditional work on the diameter of random graphs which largely fo-cuses on uniformly distributed arcs.Using a little addi-tional knowledge of the graph,we show that we canﬁnd shorter paths:with expected length O(log3/2n)in the ba-sic2-dimensional model and O(log1+1/k n)in the general k-dimensional model(for k≥1).

Finally,we suggest a general approach to analyzing a broader class of random graphs with non-uniform edge prob-abilities.Thus we show expectedθ(log n)diameter results for higher dimensional grids,as well as settings with less uni-form base structures:where links can be missing,where the probability can vary at diﬀerent nodes,or where grid-related he use of lattice distance)has a weaker role or is dismissed,and constraints(such as the uniformness of degree distribution)are relaxed.

∗This work was supported by NSF grant CCR-85961 Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for proﬁt or commercial advantage and that copies bear this notice and the full citation on theﬁrst page.To copy otherwise,to republish,to post on servers or to redistribute to lists,requires prior speciﬁc permission and/or a fee.

F.2[Analysis of Algorithms and Problem Complex-ity]:Miscellaneous;

G.2.2[Discrete Mathematics]:Graph Theory—Graph Algorithms,Network Problems;C.2.2[Net-work Protocols]:Routing Protocols

General Terms

Algorithms,Performance,Theory

Keywords

Small-world network,random graphs,routing,diameter 1.INTRODUCTION

Small-world networks(SWN)have been an active and common topic in many disciplines,including the social and natural sciences.These networks possess a striking property, the so called small-world phenomenon,also often spoken of as“six degrees of separation”(between any two people in the United States).Milgram discovered this in his pioneer-ing work in the1960’s[21],and the more recent work by Dodds et al.suggests its still true[10].Since many real net-works exhibit small-world properties,a number of network models have been proposed as a framework to study this phenomenon.Recently,Kleinberg[14],building on the work of Watts and Strogatz[22],proposed a family of SWNs to study another compelling aspect of Milgram’s originalﬁnd-ings:a greedy algorithm using only local information can construct short paths.

As Kleinberg has commented,it is striking that short paths not only exist but can be found with limited knowl-edge of the global network.Algorithmic results in this area improve our understanding of many practical network struc-tures and bring in potential applications related to routing problems in the Internet and peer-to-peer networks.Thus,ﬁnding models with this feature of SWNs,with an emphasis on algorithmic aspects,is well motivated.Kleinberg de-veloped an interesting model of this,and it has generated considerable followup work.

Kleinberg’s basic model uses a two-dimensional grid as a base with long-range random links added be

tween any two nodes u and v with a probability proportional to d−2(u,v), the inverse square of the lattice distance between u and v. In the basic model,from each node there is an undirected local link to each of its four grid neighbors and one directed long-range random link.In this setting Kleinberg shows that a simple greedy algorithm using only local information

ﬁnds routes between any source and destination using only O(log2n)expected links[14].

Kleinberg leaves two important issues open in the analysis of routing in his model.We complete the analysis in this paper and then extend our techniques to a broader range of settings.First we show that the O(log2n)expected time analysis is tight(thus except for pairs which are quite close, Kleinberg’s algorithm uses expectedΘ(log2n)links).

Our second main result shows that the expected diame-ter of this graph isΘ(log n).This extends traditional work on the diameter of random graphs which largely focuses on uniformly distributed arcs[7].This diameter result shows that an algorithm with global knowledge of the random links can improve on Kleinberg’s decentralized algorithm by a log factor.We then give an intermediate algorithm which uses some additional,fairly local,information to improve the ex-pected time to O(log3/2n)in the basic2-dimensional model and O(log1+1/k n)in the k-dimensional model(for k≥1). In addition,we develo

p techniques for analyzing random graphs with non-uniform arc distributions.We are able to characterize properties which lead to small diameter graphs, and use these to prove O(log n)expected bounds on the di-ameter of several settings:k-dimensional versions of Klein-berg’s model;settings where arc probabilities are propor-tional to d−r(u,v)for0≤r<2;also for some non-grid settings and where nodes have diﬀering degree or arc distri-butions.

Since real-world networks often are better modeled by random networks with non-uniform arc distributions,our analysis results may help create and analyze more accurate network models.

The structure of this paper.Section3presents deﬁ-nitions and supporting facts on Kleinberg’s Small-world set-ting.Section4discusses our bound on Kleinberg’s delivery time and introduces our alternative algorithm.Section5de-scribes our diameter results for Kleinberg’s model and more general settings.In Section6we summarize our results and suggest additional open problems.

2.RELATED WORK

There has been considerable work on the small-world phe-nomenon.See[15]for early surveys and[14]for a more recent account on modeling small-world networks.Before Kleinberg’s model,Watts and Strogatz[22]proposed a re-ﬁned model by randomly rewiring the edges of a ring lattice each with a

probability parameter p.Watts and Strogatz observed that for small p the model reﬂects many practi-cal small-world networks with small typical path length and non-negligible clustering coeﬃcient.

Applications have been found using Kleinberg’s small-world model or the ideas behind it,such as decentralized search protocols in peer-to-peer systems[20,25],and gos-sip protocols for spreading information in a communication network[11].See also[13]for a generalization that encom-passes both lattice-based and tree-based(“taxonomic”or “hierarchical”)small-world networks,and[1]for a practical approach to measure the diameter of the World Wide Web using a simulation model based on the power law distribu-tion.

The diameter of random graphs is a classic problem[5,6, 7,8]but most results use uniformly distributed arcs.Bol-lobas and Chung[6],study a graph model very similar to Watts and Strogatz in[22]with the nodes of a cycle(or a“ring”)randomly matched to form additional long-range links.The closest diameter work with non-uniform arc prob-abilities is by Benjamini and Berger who study the diameter of1-D long-range percolation graphs[4]and Coppersmith et al.who extend this to k-D grids[9].As in Kleinberg’s model,a grid with(undirected)local links is augmented by undirected long-range random links whose probability is inversely related to their distance.Both papers prove di-ameter results which show how the diameter changes as the arc probability parameters change.Note that in contrast to Klei

nberg’s model,the added links are undirected,and the out-degree of a node is notﬁxed.Thus the analysis tech-niques here are rather diﬀerent than those to analyze Klein-berg’s and related models.OurΘ(log n)diameter bound improves on the best prior result of O(log n log log2n)due to Lebhar and Schabanel’s work onﬁnding short paths[16]. There have also been several recent papers which analyze greedy routing in other small-world like networks.Barriere et al.[3]give matching upper and lower bounds for greedy routing in a ring network augmented by random links.They also give a nice general framework for analyzing greedy rout-ing,and prove a polylog upper bound on greedy routing for a broad class of what they call“Long-range contact”graphs. Our results have some similarities,though we focus on di-ameter issues and use mostly diﬀerent techniques.Aspnes et al.[2]and Manku and Manku et al.[18,17]look at using small-world type networks for routing in peer-to-peer sys-tems.Manku et alpares deterministic and randomized structures,as well as considering the eﬀect of one-lookahead. Aspnes et al.analyze expected performance with additional links and look at the eﬀect of edge failures.They produce some useful results for doing analysis though their focus is on the one dimensional setting.Recently,Fraigniaud et al.

[12]study greedy routing and also show that with some ex-tra knowledge about the graph it is possible to route using O(log1+1/k n)expected steps in the general k-dimensional model(for k≥1).However,they

achieve this with an oblivious greedy algorithm,while ours requires keeping ad-ditional state information.In addition,they show this path length is the best possible for a range of greedy routing algo-rithms(even ones which have complete global knowledge). They give a general lower-bound on greedy routing which also shows that Kleinberg’s O(log2n)analysis is tight for greedy routing in the basic model.

3.BASIC FACTS AND DEFINITIONS

We now present our notation and basic facts on Klein-berg’s small world setting.Let V denote the set of all nodes, the size of which is n2.We use K(n,p,q)to refer to the class of all random graphs based on Kleinberg’s model:an n×n grid,undirected local links(from a node)to all nodes within distance p,and q long range directed links/node such that the probability of a link from u to v is proportional to d−2(u,v).We also use K∗(n,p,q)for a similar class,de-ﬁned with respect to the lattice distance with wrap-around: d(u,v)=min{|k−i|,n−|k−i|}+min{|l−j|,n−|l−j|}.For simplicity,we usually show our main results for K∗(n,p,q) but most of our work can be easily extended to K(n,p,q). Let B l(u)denote the set of all the nodes within lattice distance l from u;we call this a ball of radius l and center u (actually shaped as a diamond in a two dimensional grid). Deﬁne b l(u)as the number of nodes at distance l from u, des on the‘surface’of B l(u).

First,we consider some basic facts for the simple models with p=q=1,which,we denote by K∗and K.For any two nodes u and v,let p(u,v)denote the probability that there is a random link from u to v.Since p(u,v)∼d−2(u,v),we have p(u,v)=d−2(u,v)/c u,where c u is the inverse normalized coeﬃcient,a constant depending on n and the position of u:

c u=

∀v=u d−2(u,v)=

2n−2

j=1

b j(u)j−2.

Note that we always have p(u,v)=p(v,u)if the graph is from K∗,and so c u is the same for all u;call this value c∗.It is easy to see that b j(u)=θ(j),so c u can be approximated by a harmonic sum.The following fact is trivially implied from Kleinberg’s analysis in[14].

Fact1.For graphs from K∗or K,the inverse normalized coeﬃcient c u=θ(log n)and for any two distinct nodes u and v,p(u,v)=Ω((n2log n)−1).Especially,c∗=4ln n+O(1)1.

3.1Links into or out of a ball

Our analysis of the expected diameter(section5)builds on a standard approach.We consider the expected length of a path from an arbitrary start s to a destination t;for simplicity,we start with K∗(n,1,4)for a basic framework then extend our analysis to K∗(n,1,1)and others.For K∗(n,1,4),using only the long-range links consider all nodes at distance one from s,then all new nodes these can reach (so at distance two),and so on.If these sets grow expo-nentially in size,we will quickly reach a large subset of the nodes in the graph.We prove two important facts which help to analyze the growth rate.

For a given set Q of nodes,we construct a set R,a col-lection of nodes not in Q but reachable by a random link from Q;we also consider a reverse scenario where R con-tains nodes with a random link i nto Q.We focus on lower bounding the ratio|R|/|Q|,which is supported by these two elementary experiments:to see if a random link from the center of a ball goes out of this ball and to see if there is a random link from outside the ball which goes to its center. Note that we have non-uniform probabilities,so the ratio |R|/|Q|depends on the shape of Q as well as its size;we also have directed links,so the analysis on incoming links diﬀers from outgoing links2.

Fact2.On a graph from K∗,given any positiveθ<1, any integer1≤l≤nθ,for n large enough:i)the probability

that a random link from a given node u goes to a node outside of B l(u)is greater than1−θ−o(1);ii)the probability that there is a random link to u from a node outside of B l(u)is greater than1−eθ+o(1)−almost1−eθ−1).

We are most interested in values ofθjust above0.5since then|B l(u)|=Ω(n log n).Thus,a link going out has prob-ability 0.5(1−θ)and in is 0.39(1−eθ−1). Proof.For i)let E be the event that u has a random link going to a node v/∈B l(u).We have

Pr[E]=

v∈B l(u)d−2(u,v)/c∗≤

j=1

b j(u)(j−2)/c∗.

1Since in K∗,b

j (u)=4j if j≤ n/2 and0otherwise.

2In K∗|K,each node has exactly one outgoing random link, but the number of incoming links varies.Since b j(u)≤4j we have

j=1

b j(u)(j−2)≤4

j=1

j−1≤4ln(3l)≤4ln(3nθ).

With c∗=4ln n+O(1)we have

Pr[E]≤

4ln(3nθ)

4ln n+O(1)

≤

θln n+ln3

ln n+O(1)

=θ+o(1)

Thus Pr[E]≥1−θ−o(1).

For ii)let F be the event that there is a random link coming to u from a node v outside B l(u);thus,

Pr[F]=

v/∈B l(u)

(1−p(v,u))≤

v/∈B l(u)

e−p(v,u)=e

−

p(v,u)

v/∈B l(u)

Here we use the well known fact e x≥1+x to obtain e−p(u,v)≥1−p(u,v).Since p(u,v)=p(v,u),

Pr[F]≤e

−

p(u,v)

v/∈B l(u)=e−Pr[E]≤eθ+o(1)−1

So Pr[F]=1−P r[F]≥1−eθ+o(1)−1

In a similar manner,we have an equivalent result for K as follows.See[19]for the proof of this.

Fact3.On a graph from K,given any positiveθ<1and integer1≤l≤nθ,for n large enough:i)the probability that a random link from a given node u goes to a node outside of B l(u)is greater than1−θ

1+3θ

+o(1);ii)the probability that there is a random link to u from a node outside of B l(u)is greater than1−e−(1−θ)/4+o(1).

3.2Extended models

As a natural generalization,we can extend all these con-cepts and basic facts to the classes of graphs based on a k-dimensional grid for k≥1by updating the distribution rule of the random links:p(u,v)∼d−k(u,v)instead.Let K∗(k,n,p,q)and K(k,n,p,q)denote such classes of graphs based on this k-dimensional grid(with size n in each di-mension)with and without wrap-around,respectively.In K(k,n,p,q),for a node u close to the edge of the grid,a ball centered at u may not be a‘full’ball but it is easy to extend all our results with balls to cover such a case.

It is relatively easy to prove that b(j)=θ(j k−1)3.So, b j(u)j−k=θ(j−1),which means that we can still use h

ar-monic sums to bound the likes of

b j(u)j−k as before. Thus,we still have

c u=θ(log n).More importantly,we can generalize the facts on“links into or out of a ball”as follows(for simplicity,we do not claim more exact bounds).

Fact4.On a graph from K(k,n,p,1)or K∗(k,n,p,1), for any given4positiveθ<0.6,there exist positive constants ξ1andξ2such that for n large enough and l≤nθ:

i)the probability that there is a random link from a node u to a node outside of B l(u)is≥ξ1;

ii)the probability that there is a random link(with a given label)to u from a node outside of B l(u)is≥ξ2.

3Based on counting the number of ways to choose k positive numbers such that they sum to a given positive number(j). 4The fact holds for any0<θ<1but we useθ<.6since we only needθabout0.5in our later analysis).

It is also easy to extend this fact for arbitrary q≥ 1. Each node now has q independent random links where each of them can be labeled from1to q.Fact3is then,concerned with a link speciﬁed by a given label and an arbitrary center node u.

3.3Links to a spherical surface

We now study another probabilistic experiment in the general models K(k,n,p,q)and K∗(k,n,p,q),namely,if a random link from a given node u goes to the surface of a given ball B,where u is outside of B.We denote the prob-ability of this by P(u,S B),where S B is the set of nodes on the surface of ball B.Also deﬁne distance d(u,B)between node u and ball B as the minimum lattice distance from u to a node on B’s surface.Our lower bound in section4will be based on this experiment.

Fact5.For any k≥1,there exists a constantˆc such that for any ball B=B l(v)and node u outside B on a graph from K∗|K(k,n,p,q):P(u,S B)≤ˆc

m log n

,where m=d(u,B).

Also,if k≥2and l≤m then P(u,S B)≤ˆc l

Note that for theﬁrst(main)part of the fact,the bound

(ˆc

m log n )depends on the distance from u to the ball only,and

thus,is independent of the size of the ball.Intuitively,if we think of this probability as a measure of‘attractive force’(to u),the force generated by S B is not stronger than the joint force generated by m nodes at about distance m from u.That is,a small fraction of nodes in S B,which are at the ‘pole’closest to u,generates a dominant term for P(u,S B).

A rigorous proof for theﬁrst part is complicated,so we leave the proof to the appendix.Meanwhile,the second

part is easy:P(u,S B)=

w∈S B p(u,w)=O(l

)with

p(u,w)≤1

m k logn

and|S B|=b(l)=O(l k−1)≤l×O(m k−2).

4.DECENTRALIZED ROUTING

Kleinberg proved that the expected number of links used to route from a source s to a destination t(Kleinberg’s de-livery time)by the greedy algorithm is O(log2n)for the two-dimensional case[14]5.We will prove that indeed the greedy algorithm takes expectedΩ(log2n)delivery time for the general k-dimensional model and suggest other variants with better delivery time.Let GA denote the greedy al-gorithm.For a node v we call N(v)the next node found by GA when we are at v(initially,v=s)and seeking a path to the destination node t.Deﬁne random variable δ(v)=d(v,t)−d(N(v),t),which characterizes the speed that GA converges to t.The following lemma,which is ac-tually a re-statement of fact5,establishes a probabilistic bound for this speed.

Lemma6.For each k≥1(the dimension),there ex-ists a constantˆc such that,for a graph from K(k,n,p,q) or K∗(k,n,p,q),any two nodes v and t,and any integer

1<m<d(v,t),Pr[δ(v)=m]≤ˆc

mlogn ×min{1,d(v,t)−m

The following theorem shows that for the majority of s−t pairs,Kleinberg’s delivery time can not be o(log2n),there-fore we have the tight bound ofθ(log2n)for the greedy al-gorithm.

5It is straightforward to generalize his proof for the general k-dimensional model

Theorem7.For any constant c1>0,there exists a con-stant c2>0such that,for any two nodes s and t on a graph

from K∗|K(k,n,1,1)with d(s,t)>c1n,Kleinberg’s delivery dpoints s and t is greater than c2log2n with probability at least0.5,when n is large enough6.

Proof.Consider a series of random trials,where,given a node v,weﬁnd N(v)the node for the next trial.De

ﬁne random variable X v=d(v,t),which reﬂects the ratio between the distance to t before and after such a trial.Let {v1=s,v2,...,v k}be the nodes on a run of GA after k such steps(thus v i+1=N(v i)).

k−1

i=1

X v

reﬂects the ratio between the distance at the initial and at the current state. To make this ratio always well-deﬁned,if N(v)=t we deﬁne X v=d(v,t)and if v=t we deﬁne X v=1.

It is clear that1≤

k−1

i=1

X v

≤d(s,t)and we need to analyze the expected k to have

k−1

i=1

X v

=d(s,t)(indicat-ing that d(v k,t)≤1,so GA is about toﬁnish),or

Z v

= ln(d(s,t))where Z v=lnX v.From lemma6,we can prove that E[Z v]<c/log n for some constant c(proved below),if d(v,t)≥ln n.To get rid of the condition,d(v i,t)>ln n,we modify GA a bit,so whenever we get to a node v within a distance ln n from d(v,t)≤ln n),set N(v)=t and redeﬁne Z v=0(this will not weaken ou

r proof since when we get there we can simply walk to t by at most lnn local links).Thus we always have E[Z v

]≤c/lnn,∀i=1..k. Consider Z k=Z v

+Z v

+...+Z v

.If we reach t with less than k steps,or say,v j=t for some j<k,then as mentioned before,we have Z v

=Z v

j+1

=...=Z v

k−1

=0. We now show that

P r[Z k<M]≥0.5(1) where k= ln2n and M=ln(d(s,t))≥ln n+ln c1−ln(ln n).

Since E[Z v

]<c/ln n,we have

E[Z k]≤k×

ln n

≤

lnn

<0.5M

for n large enough.Thus,P r[Z k≥M]<0.5since otherwise E[Z k]≥0.5M.Hence,P r[Z k<M]≥0.5.From(1), choose c2= 1

then at least half of the time,GA can not ﬁnish with less than c2log2n steps;the theorem follows. Now,we show that there exists a constant c such that E[Z v]≤c/ln n if d=d(v,t)≥ln n.Deﬁne the function f(i)=ln(d

d−i

);note that Z v=ln(d

d−δ(v)

).We need to show that

i=1

f(i)Pr[δ(v)=i]≤c/ln n for some constant c. First we have

d/2

i=1

f(i)Pr[δ(v)=i]≤

d−1

ln n

d−1

i=d/2

1/i≤

c +1

ln n

for some constant c .Note that we have Pr[δ(v)=i]≤c

i ln n

from lemma6and f(i)≤i

(d−i)

from the common

fact that ln(1+x)<x,∀x>−1.Similarly,we have d

connect下载

i=d/2

f(i)Pr[δ(v)=i]≤c

ln n

for some constant c ,us-

ing Pr[δ(v)=i]≤c (d−i)

i2ln n

from lemma6.Now,choose c=c +c +

6Note that we can actually make the probability above an arbitrarily high constant(<1),but we use0.5for simplicity.

4.1Algorithms for improved delivery time

We now consider variants of Kleinberg’s greedy algorithm which use additional knowledge of the graph to improve ex-pected path length.Our basic algorithm operates on graphs from class K ∗(n,1,1)but it can be easily extended to more general classes.We assume that each node u knows the long-range links of the log n neighbor nodes closest to u in the grid.Call W u ,the set of these neighbors and their long range contacts,the view at u .During a basic step of routing,if u is the current message router,we go next to v ,the node in W u closest to t (we may need to follow several local links and,possibly one ﬁnal random link to v ).Once we reach v ,it becomes the current node.

In Kleinberg’s algorithm W u includes only the nodes in-cident to links from u (at most 5in a two-dimensional grid)and a routing decision is made at each node on the path.In our algorithm,once we make a decision for a basic step (at the initial message router)we follow the sub-path to the next node where a decision is made.When we are at an intermediate node in a basic step,we follow local links (up,down,left or right)except for the ﬁnal link which may be a random long-range link).We can describe

this sub-path using only O (log log n )extra bits to specify the last node on the sub-path reached by local links only.Alternately,we can use 2√

log n bits to describe the sequence of local moves (two bits per local move).

We now adapt Kleinberg’s analysis to show that the ex-pected number of links used in this algorithm is O (log 3/2n ).We say that node u is in phase i if 2i <d (u,t )≤2i +1.Kleinberg proved that if u is in phase i ,and v is u s long range contact,the probability that v is in some phase j <i is proportional to 1/log n (for i >log log n ).Let X i be a random variable recording the number of basic steps our algorithm executes from nodes in phase i .

Lemma 8.For i >log log n ,E [X i ]<c for a constant c .Proof.Each basic step considers at least log n/2new long-range links (since at most half of the current message holder’s set W u overlaps all previous sets),and by Klein-berg’s result has at least a constant probability of ﬁnding a link to a node within distance 2i of t .

Theorem 9.Our improved algorithm visits expected O (log 3/2n )nodes in K ∗(n,1,1).

Proof.Using the lemma above,the expected number of basic steps till we get within distance log n is O (log n )(constant per phase for log n phases).Each basic step visits at most √log n nodes

(for a 2-D grid),so the total length is O (log 3/2n ).

More careful work (using tail inequalities)gives a slightly stronger result:a longest path found has length O (log 3/2n )with overwhelming probability.The underlying intuition is to think of doing a chain of basic steps until we have about log n which advance to a new phase.

If we use the same algorithm on classes K ∗/K (n,p,q )where p,q ≥1,it is relatively easy to show a better bound

of O (log 3/2

n p √q ).So,the bound is just O (log n )if for instance,p =Ω(log 1/2n )and q is a constant,or p is a constant and q =Ω(log n ).It is also easy to extend our algorithm to K ∗/K (k,n,p,q ).For p =1,q =1,we obtain the upper bound O (log 1+1/k n ),which is close to the ideal O (log n )when k is large.

A related approach is analyzed in [12]who show that the O (log 1+1/k n )bound for k -dimensions can be achieved by an oblivious algorithm.They also show that this bound is tight for a range of greedy algorithms (even with much more information).

5.EXPECTED DIAMETER 5.1

Basic approach

For simplicity we will ﬁrst consider Kleinberg’s basic 2-D grid,but with four long-range links per node.We show that the expected diameter of a graph from K ∗(n,1,4)is Θ(log n )and then extend our approach to other and more general classes of graphs.

For a source s and a destination t =s chosen arbitrarily from V ,we will show that there is an O (log n )length path from s to t with overwhelming probability.The basic idea is to construct two sets of nodes S and T which are each of size Θ(n log n )such that:all the nodes in S can be reached from s by a directed path of length O (log n )of long-range links;all the nodes in T have a directed path of length O (log n )to t using long-range links.It is then simple to show that with overwhelming probability,there is a directed arc from a node in S to a node in T ,thus there is an O (log n )length path from s to t .The bulk of the proof is showing that appropriate sets S and T exist with high probability.Deﬁne χ(u )with u ∈V as the set of nodes v such that (u,v )is a long-range link;also χ(A )= u ∈A χ(u ),for any A ⊂V .Our basic idea is to ‘grow two trees’with roots from s and t to create S and T .We construct a chain of disjoint

subsets {S k }µ

k =0with S 0=B r (s )where r =r 0√log n for

some constant r 07.Let C k =

k i =0S i and deﬁne S k +1=χ(S k )−C k .Thus,S k +1is built by iteratively applying χon elements of S k and taking only ‘fresh’nodes,which have not been in any preceding subsets.

Later we construct a similar subset chain {T k }νk =0to cre-ate the t −tree.The probability of success (ﬁnding sets S,T which connect)can be made as close to one as needed by choosing r 0(and thus S 0)suﬃciently large.Thus we get an O (log n )bound on the expected diameter of a graph in K ∗(n,1,4).

A worst-case analysis with a ball experiment .

We will show that the subset chain almost surely grows exponentially in size:there is a constant γ>1such that Pr[|S k +1|/|S k |>γ]is almost one when c log n <|S k |<αn log n for suitable constants c and α.Our goal is to ﬁnally form a set S µsuch that |S µ|≥αn log n and µ=O (log n ).For a ﬁxed γ,consider the following S-construction process :starting with S 0we successively create S 1,S 2,...until either we get a set S µwith |S µ|≥αn log n (we succeed)or we have some |S k +1|/|S k |≤γand we fail.As soon as either case happens we stop the process.Note that if we succeed we do so in O (log n )

χ(u )steps,so µ=O (log n )and S µis our desired set S .The following analysis shows that our S -construction succeeds with high probability.

The main concern is how many fresh nodes we get from χ(S k )to include in S k +1.We ﬁx an order to scan S k and for each u ∈S k ,a node v ∈χ(u )is fresh if it has not been included in a subset G ,which contains the current tree (the union of the current S k +1and all the preceding S i ,i ≤k ).7

Thus,there are θ(log n )nodes in S 0all reachable from s by O (√log n )local links.

To analyze this we make the following crucial observation. We consider this experiment E(u,G):we generate a random link8from u and consider if u/∈G;let X(u,G)denote the indicator random variable of this happening.We do this ex-periment4times for each u∈S k and|S k+1|will be the sum of these4|S k|random variables X(u,G);note that these variables are not identical since G keeps growing larger.Let G k denote the whole process.Note that Pr[X(u,G)=1] depends on both the size and shape of G.Thus we ask:if we move the elements in G around,how would we minimize Pr[X(u,G)=1]?The nature of the inverse second power distribution makes it clear:this can be done by moving ele-ments of G closer to u;in fact,Pr[X(u,G)=1]is minimized (for aﬁxed size for G)when G is like a ball with u as its center.

Thus we can lower bound|S k+1|/|S k|using the follow-ing worst-case setting for selecting a link from u.

Let C=Θ(n log n)be the maximum size set C j we can ever have in our S-construction process.Let H be a ball with cen-ter s and size C(or the next larger size to make a full ball).Since H is chosen such that|H|≥|C k+1|(i.e.always ≥|G|)and has the worst possible shape,Pr[X(u,G)=1]≥Pr[X(s,H)=1]for any pair u,G used in our S-construction process.Since by fact2,E[X(s,H)]>βfor anyﬁxed β<0.5when n is large enough,we conclude:

Fact10.For anyﬁxedβ<0.5,E[X(u,G)]>βfor each u,G pair in our S-construction.

We now show that each new set is successfully constructed with high probability:

Lemma11.For anyη>0,andγ∈(1,2)we can choose a constantˆc such that,for k≥0,if|S k|≥ˆc ln n and |C k+1|=O(n log n)then Pr[|S k+1|/|S k|>γ]=1−O(n−η) for n large enough.

Proof.Let m=|S k|,so|S k+1|is the sum of4m X(u,G) indicator R.V’s.By fact10,we treat this as the sum of4m independent Bernoulli random variables each with expecta-tion≥β.Applying Chernoﬀ’s inequality,for0<δ<1,

Pr[|S k+1|≤(4mβ)(1−δ)]≤e−2mβδ2≤n−2ˆcβδ2≤n−ηfor a proper choice ofˆc.Thus,Pr[|S k+1|/|S k|>γ]=1−O(n−η)forγ=4β(1−δ).

Indeed we can always chooseδ>0small enough so that γ=4β(1−δ)can be arbitrarily close to4β,and thus,ar-bitrarily close to2asβ=0.5−o(1).Once a value from (1,2)is chosen forγ(and so0<δ<1and0<β<0.5 are chosen accordingly),given anyη>0,we can chooseˆc such that when m≥ˆc log n we have Pr[|S k+1|/|S k|>γ]= 1−O(n−η

In fact,by some simple computation,we can see thatˆc>η/(1−γ)2is a suﬃcient condition toﬁnd suchˆc.So,if we haveη=6(9),γ=1.8(byβ=0.48andδ=0.0625for example)then we can chooseˆc=601.

Thus,givenﬁxed constantsηandγwe can chooseˆc as above,so if|S0|≥ˆc log n,we can expect the cardi-nality series{|S i|}to grow exponentially before reaching 8Using inverse second power distribution

9Later,in ourﬁnal step,we use this value to prove the two trees,s-tree and t-tree,connect with overwhelming proba-bility

or exceeding the thresholdαnlogn during theﬁrst g+1 terms,where g= logγ(αn/ˆc) =θ(log n).By iterating lemma11appropriately(in each step of growing S i+1from S i when|S i|<αnlogn),we can show that almo

st surely |Sµ|≥αnlogn for someµ≤g=θ(log n).Note that Sµcontains only‘fresh’nodes such that we have not yet con-sidered their random links.We now prove that we almost surely get a large s-tree within O(log n)steps.

Lemma12.For any given node s∈V,anyθ>0and anyα>0,by constructing s’s subset chain as above,we will obtain subset Sµwhere any node in Sµcan be reached from s by an O(log n)path and Pr[|Sµ|≥αnlogn]=1−O(n−θ).

Proof.Consider a successful S-construction where we have a succession of steps where|S k+1|/|S k|>γuntil we reach sizeαn log n.From lemma11we can pickη>θso that Pr[|S k+1|/|S k|>γ]>1−O(n−η)for each step.

The probability that the S-construction process succeeds is greater than the probability we get g consecutive success-ful growth steps with g= logγ(αn/ˆc) =θ(log n)Thus Pr[S-construction process succeeds]≥

1−c1n−η

≥1−gc1n−ηfor some constant c1>0(using a basic calculus fact: (1+x)n≥1+nx for any x>−1and n≥1).Sinceθ<ηand,g=O(log n),it is easy to see that gc1n−η=O(n−θ) and hence,we create the desiredﬁnal set with probability ≥1−O(n−θ).

Another ball experiment to analyze‘t-tree’.

We now consider a tree of nodes with paths to t.We now use a functionˆχwhich,given an input node u,outputs the nodes with a random link to u.As before,we construct a subset chain{T k}ν0by having T k+1=ˆχ(T k)−ˆC k where ˆC

i=0

T i.Thus,we include into T k+1all the fresh nodes which have a random link to any node in T k.We can use much the same approach as before,but with some modiﬁcations and additional details.We still use

a state-variable G to denote the set of all nodes in the tree we have reached so far:G=ˆC k if we have justﬁnished theﬁrst k subsets,otherwise G is the union ofˆC k and the developing T k+1.Let G=V−G,the set of nodes not in the tree yet. Also,for each random link from each node u∈V we assign a label i=1..4.

We now look closer at processˆG k,the growing step of T k+1from T k,which is more complicated than the coun-terpart in the s-tree.LetˆE(u,i,G)denote an experiment which has each node w∈G look at its random link labeled i,and if this link hits u we add w to G(for simplicity we may also useˆE u instead).ProcessˆG k simply repeats the ˆE(u,i,G)experiment10for each node u∈T

and for each i=1..4.As with the s−tree,we now consider an artiﬁ-cial experiment which should be‘outperformed’by the real ˆG

but is easier to analyze.Let H be a ball centered at u with size at leastˆC,the maximum size we allow a set to get in our T-construction process.LetˆX(u,H)be a ran-dom variable which is one if at least one ra

ndom link from a node in V−H hits u,otherwise it is zero.In our T-construction process,letˆX(u,i,G)be an indicator random variable recording whether some node in¯G has a link with label i to u.We now argue that P r[ˆX(u,i,G)=1]>.39 (the bound on a link into a ball in Fact2).

10Note that G is changing as we discover new fresh nodes and we canﬁx the order to scan T k initially.

688IT编程网

...and other) small-world models_免费下载

发表评论

推荐文章

java正则表达式选择题

一种基于正则表达式的DBC文件解析及报文分析方法[发明专利]

工龄小数点提取

非零金额正则表达式

提取文本中数字的函数

热门文章

excel文字递增函数公式

数字递增公式

notepad 正则变量运算

C++regex库常用函数及实例

js正则表达式之前瞻后顾与非捕获分组

indesign正则数字和英文之间的空格

C#匹配中文字符串的4种正则表达式分享

PHP正则表达式匹配中文字符

匹配中文汉字的正则表达式介绍

Python正则表达式如何进行字符串替换

orcl中用正则表达式

sql正则表达式excel

dataframe正则表达式

postgress sql正则

el-upload accept 正则表达式

半小时正则表达式

判断科学计数法的正则

根据url判断静态资源的方法

Java正则表达式-匹配正负浮点数

替换模糊匹配正则-hive

最新文章

一种基于正则表达式的DBC文件解析及报文分析方法[发明专利]

能被5整除的十进制整数的正规表达式

大于0小于等于1的正则表达式

linux grep 26个字母

java pattern 正则表达式

掌握文本编辑器中的搜索和替换技巧

标签列表

688IT编程网

...and other) small-world models_免费下载

发表评论

推荐文章

java正则表达式 选择题

一种基于正则表达式的DBC文件解析及报文分析方法[发明专利]

工龄小数点提取

非零金额 正则表达式

提取文本中数字的函数

热门文章

excel文字递增函数公式

数字递增公式

notepad 正则变量运算

C++regex库常用函数及实例

js正则表达式之前瞻后顾与非捕获分组

indesign正则数字和英文之间的空格

C#匹配中文字符串的4种正则表达式分享

PHP正则表达式匹配中文字符

匹配中文汉字的正则表达式介绍

Python正则表达式如何进行字符串替换

orcl中用正则表达式

sql正则表达式excel

dataframe正则表达式

postgress sql正则

el-upload accept 正则表达式

半小时 正则表达式

判断科学计数法的正则

根据url判断静态资源的方法

Java正则表达式-匹配正负浮点数

替换模糊匹配正则-hive

最新文章

一种基于正则表达式的DBC文件解析及报文分析方法[发明专利]

能被5整除的十进制整数的正规表达式

大于0小于等于1的正则表达式

linux grep 26个字母

java pattern 正则表达式

掌握文本编辑器中的搜索和替换技巧

标签列表

java正则表达式选择题

非零金额正则表达式

半小时正则表达式