Many-server scaling of the N-system under FCFS–ALIS

Queueing Systems, Oct 2017

The N-system with independent Poisson arrivals and exponential server-dependent service times under the first come first served and assign to the longest idle server policy has an explicit steady-state distribution. We scale the arrival rate and the number of servers simultaneously, and obtain the fluid and central limit approximation for the steady state. This is the first step toward exploring the many-server scaling limit behavior of general parallel service systems.

https://link.springer.com/content/pdf/10.1007%2Fs11134-017-9549-7.pdf

Dongyuan Zhan · Gideon Weiss

Corresponding author: Dongyuan Zhan

Department of Statistics, The University of Haifa, 31905 Mount Carmel, Israel
School of Management, University College London, 1 Canada Square, London E14 5AA, UK

Keywords: N-system; Many-server scaling; Fluid limits; Central limits; First come first served; Assign to the longest idle server

Mathematics Subject Classification: 60K25

Research supported in part by Israel Science Foundation Grants 711/09 and 286/13.

1 Introduction

In this paper we study the many-server N-system shown in Fig. 1, with Poisson arrivals and exponential service times, under the first come first served and assign to the longest idle server policy (FCFS–ALIS), as the number of servers becomes large. Before describing the model in detail, we first discuss our motivation for studying this system.

The N-system is one of the simplest special cases of skill-based routing in parallel server systems, as defined in [9,15] and further studied in [4,6,7,12–14,17,19,20,22,23]. The general model has customers of types $i = 1, \dots, I$, servers of types $j = 1, \dots, J$, and a bipartite compatibility graph $G$, where $(i, j) \in G$ if customer type $i$ can be served by server type $j$. Arrivals are renewal with rate $\lambda$, where successive customer types are i.i.d. with probabilities $\alpha_i$. There are a total of $n$ servers, of which $n\beta_j$ are of type $j$, and service times are generally distributed with rates $\mu_{i,j}$.
Assume the system is operated under the FCFS–ALIS policy, that is, servers take on the longest waiting compatible customer, and arriving customers are assigned to the longest idle compatible server. For this general system, necessary and sufficient conditions for stability (positive Harris recurrence for given $\lambda$), or for complete resource pooling (there exists a critical $\lambda_0$ such that the system is stable for $\lambda < \lambda_0$, and the queues of all customer types diverge for $\lambda > \lambda_0$) cannot be determined by the first moment information alone (as conjectured by an example of Foss and Chernova [9], which is further discussed in [16]). In particular, under FCFS–ALIS, calculation of the matching rates $r_{i,j}$, which are the long-term average fractions of services performed by servers of type $j$ on customers of type $i$, is in general intractable. In the special case that service rates depend only on the server type, and not on the customer type, with Poisson arrivals and exponential service times, the system has a product form stationary distribution, as given in [2]. In that case matching rates can be computed from the stationary distribution.

The following conjecture was made in [4]. If the system is stable and has complete resource pooling for given $\lambda$, $n$, and we let both become large together, the behavior of the system simplifies: there will exist $\theta_j$ such that servers of type $j$ perform a fraction $\theta_j$ of the services, and the matching rates $r_{i,j}$ will converge to the rates for the FCFS infinite matching model with $G$, $\alpha$, $\theta$, as calculated in [1] (see also [5]). The conjecture is based on the following heuristic argument: in steady state the times that each server becomes available form a stationary process which is only mildly correlated with the other servers, and so servers become available approximately as a superposition of almost independent stationary processes, which in the many-server limit becomes a Poisson process, and server types are then i.i.d. with probabilities $\theta_j$, while customer types arrive as an i.i.d. sequence with probabilities $\alpha_i$. This corresponds exactly to the model of FCFS infinite matching.

Under FCFS–ALIS it is also possible that while the system is stable, service by all the servers is not pooled. Instead it is decoupled: the bipartite compatibility graph breaks into two or more subgraphs, and when the system is operated under FCFS–ALIS the links connecting the subgraphs are only rarely used. The conjecture then is that under many-server scaling this decoupling is the same as in the FCFS infinite matching model, with the same matching rates.

In our current study of the many-server N-system we shall verify the conjectured many-server behavior for this simple parallel server system. To do so we start from the known stationary distribution of the N-system with many servers, as derived in [2], and study its behavior as $n \to \infty$. As it turns out, the product form stationary distribution, even for this simple case, is far from simple, and the derivations of limits, which use summations over server permutations and asymptotic expansions of various expressions, are quite laborious. We feel that this emphasizes the difficulty in verifying the conjectured behavior of the general system, which remains intractable at this time.

We mention that the N-system with just two servers has been the subject of several papers, including [3,10,11,19,20]. In this paper, our focus is on the N-system with many servers under FCFS–ALIS and its limiting behavior.

The rest of the paper is structured as follows. In Sect. 2 we describe the model, and in Sect. 3 we use some heuristic arguments to obtain a guess at the limiting behavior, where we distinguish between pooled and decoupled modes. In Sect. 4 we verify the heuristic guess and obtain the stationary behavior under many-server scaling. In Sect. 5 we illustrate our results with some numerical examples. To improve the readability of the paper we have put all the proofs for Sect. 4 in the Appendix.

2 The model

In our N-system, customers of types $c_1$ and $c_2$ arrive as independent Poisson streams, with rates $\lambda_1$, $\lambda_2$. There are skill-based parallel servers: $n_1$ servers of type $s_1$, which are flexible and can serve both types, and $n_2$ servers of type $s_2$, which can only serve type $c_1$ customers. In our notation, $c_1$ customers and $s_1$ servers are flexible, while $c_2$ customers and $s_2$ servers are inflexible. ($s_2$ servers cannot serve $c_2$ customers.) We assume service times are all independent exponential, with server-dependent rates. The service rate of an $s_1$ server is $\mu_1$; the service rate of an $s_2$ server is $\mu_2$. See Fig. 1. We let $\lambda = \lambda_1 + \lambda_2$, $n = n_1 + n_2$. The service policy is FCFS–ALIS.

The system is Markovian. In [2,3,21] the following state description for the skill-based parallel server systems under the FCFS–ALIS policy was used: imagine the customers arranged in a single queue by order of arrival, and servers are attached to the customers which they serve, and the remaining idle servers are arranged by increasing idle time in front of the queue; see Fig. 2. The state is then

$$s = (S_1, q_1, S_2, q_2, \dots, S_{n-i}, q_{n-i}, S_{n-i+1}, \dots, S_n),$$

where $S_1, \dots, S_n$ is a permutation of the $n$ servers; the first $n - i$ servers are the ordered busy servers, and the last $i$ servers are the ordered idle servers, and where $q_j$, $j = 1, \dots, n - i$, are the queue lengths of the customers waiting for one of the servers $S_1, \dots, S_j$, and skipped (could not be served) by servers $S_{j+1}, \dots, S_n$. When service rates depend only on the servers, arrivals are Poisson, and services are exponential, this description is Markovian, as shown in [21].
The reason is as follows: given the permutation of servers, we know for each $q_j$ exactly what types of customers may be present, and since those customers are in the order in which they arrived, the type of each of them is randomly distributed according to the initial frequencies of customer types, and independent of all others. Hence, each server with a queue in front will have to go through an independent sequence of trials as he scans the customers FCFS until finding a match, and the specific sequences of customer types in the queues are not relevant to the steady state of the scan. This yields Markovian transition probabilities.

For the special case of the N-system, in steady state, the following three random quantities are important: $i_1 = I_1(s)$, the number of idle servers of type $s_1$; $i_2 = I_2(s)$, the number of idle servers of type $s_2$; and $k = K(s) \ge 0$, the number of servers of type $s_2$ which follow the last server of type $s_1$ in the sequence $S_1, \dots, S_n$. An incoming $c_2$ customer has to skip $k$ $s_2$ servers and find the last $s_1$ server to be served. We let $i = I(s)$ be the total number of idle servers in steady state. Because of the structure of the N-system and the FCFS–ALIS policy, the following properties hold for $i = 0, \dots, n$ and $k = 0, \dots, n_2$:

(i) There are no customers waiting for any server which precedes the last $s_1$ server in the permutation. In other words, for all $j < \min(n - k, n - i)$ we have $q_j = 0$. In particular, if there is an idle server of type $s_1$ (meaning $i > k$), then there are no waiting customers at all.
(ii) If there are any idle servers, then there are no type $c_1$ customers waiting for service; in other words, if $i > 0$, then all the waiting customers are of type $c_2$.
(iii) If there are no idle servers (all servers are busy), then only the last queue can contain type $c_1$ customers; in other words, if $i = 0$, then the last queue may contain customers of both types, but all the other waiting customers are of type $c_2$.
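As a small illustration of the three summary quantities, the following sketch reads $i_1$, $i_2$ and $k$ off a server permutation. The encoding is an assumption made for this example only: the permutation is a list of type labels (1 for $s_1$, 2 for $s_2$) in the order $S_1, \dots, S_n$, the last `num_idle` entries are the idle servers, and queue lengths are ignored. The function name `summarize_state` is hypothetical.

```python
from typing import List, Tuple


def summarize_state(server_types: List[int], num_idle: int) -> Tuple[int, int, int]:
    """Return (i1, i2, k) for a permutation S_1..S_n of server types.

    server_types[j] is 1 (type s1) or 2 (type s2) for position j + 1;
    the last num_idle positions hold the idle servers (assumed encoding).
    """
    n = len(server_types)
    idle = server_types[n - num_idle:]
    i1 = idle.count(1)  # idle servers of type s1
    i2 = idle.count(2)  # idle servers of type s2

    # k counts the s2 servers that follow the last s1 server in the
    # whole sequence, whether those servers are busy or idle.
    k = 0
    for server_type in reversed(server_types):
        if server_type == 1:
            break
        k += 1
    return i1, i2, k


# Seven servers, the last four idle: S = (s2, s1, s1 | s2, s1, s2, s2)
example = [2, 1, 1, 2, 1, 2, 2]
i1, i2, k = summarize_state(example, num_idle=4)
print(i1, i2, k)
```

In this example the total number of idle servers is 4 and $k = 2$, so $i > k$ and an idle $s_1$ server exists, which by property (i) means no customers are waiting.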
Denote … Then a necessary and sufficient condition for stability is … Throughout the paper, we assume the above stability condition.

For the stable system, define $\theta$ as the long-term fraction of customers served by servers of type $s_1$, and $1 - \theta$ as the long-term fraction of customers served by servers of type $s_2$. Since type $s_1$ servers are the only ones that can serve type $c_2$ customers, we must have $\theta \ge 1 - \alpha$, or, equivalently, $\alpha + \theta \ge 1$. The stable system under FCFS–ALIS may operate in two different modes: it may be that servers of both types share the service of customers of type $c_1$, in which case $\theta > 1 - \alpha$ and we say that resource pooling occurs for large $n$, or it may be the case that servers of type $s_1$ serve almost exclusively customers of type $c_2$, and almost all the service of customers of type $c_1$ is done by servers of type $s_2$, in which case $\theta \approx 1 - \alpha$ for large $n$, and we say that the system is decoupled.

Using the results of [1,2] we can then write the exact stationary distribution of this system. We wish to show that, as the arrival rate and the number of servers increase, the system simplifies, and we get very precise many-server scaling limits, and in particular we find sharp conditions for pooled or decoupled modes of operation.

We will investigate the behavior of the system when we fix the values of $\alpha$, $\beta$, $\rho$, and let $n \to \infty$. To be precise, we shall then have $n$, $n_1 = \beta n$, $n_2 = n - n_1$, $\lambda = \rho(\mu_1 n_1 + \mu_2 n_2)$, $\lambda_1 = \alpha\lambda$, $\lambda_2 = (1 - \alpha)\lambda$, all of which go to infinity. Average processing times $1/\mu_1$, $1/\mu_2$ are fixed and not scaled.

3 Heuristic fluid calculations

In this section we use some heuristic arguments to guess at the fluid behavior of the many-server system. In particular, we calculate a guess for some key quantities. Using these quantities we give a heuristic description of how the system will behave under the FCFS–ALIS policy, in the many-server case, distinguishing between pooled and decoupled modes of operation. The main part of the paper, in Sect. 4, is the verification of these guesses.

We assume fixed parameters that satisfy the stability condition of Sect. 2, so that the system under FCFS–ALIS is stable. We then observe that under many-server scaling there will almost always be some idle servers available of both types and customers will almost never wait, so that they will enter service immediately upon arrival. At the same time, when a server completes a service there will almost never be any waiting customers, so, after almost every service completion, the server will experience some idle time. Because our policy is ALIS, when a server becomes idle, he always joins the end of a queue of idle servers. In a slight abuse of the notation, we reuse $I_1$, $I_2$ and $K$ to denote, respectively, the stationary numbers of idle servers of type $s_1$, of idle servers of type $s_2$, and of servers of type $s_2$ which follow the last server of type $s_1$ in $s$.

When the system is stationary, the sample path of each server will consist of a sequence of cycles, each of which consists of a single service period followed by an idle period (which can be equal to 0). We denote the generic idle periods between services by $Y_1$, $Y_2$, and their expected values by $T_1 = E(Y_1)$, $T_2 = E(Y_2)$. We can bound the values of $T_1$, $T_2$ as follows: servers of type $s_2$ can serve only customers of type $c_1$, some of which may also be served by servers of type $s_1$. Hence, the arrival rate per server is no larger than $\lambda_1/n_2$, and so the average interval between arrivals is no less than $n_2/\lambda_1$, and the average service time per arrival is $1/\mu_2$; hence $T_2 \ge n_2/\lambda_1 - 1/\mu_2$. Servers of type $s_1$ serve all customers of type $c_2$ and may in addition serve some customers of type $c_1$. Hence, the arrival rate per server is no less than $\lambda_2/n_1$, and so the average interval between arrivals is no larger than $n_1/\lambda_2$. The average service time per arrival is $1/\mu_1$; hence, $T_1 \le n_1/\lambda_2 - 1/\mu_1$. Hence, we have found that the stationary expected idle times satisfy

$$T_1 = E(Y_1) \le \frac{n_1}{\lambda_2} - \frac{1}{\mu_1}, \qquad T_2 = E(Y_2) \ge \frac{n_2}{\lambda_1} - \frac{1}{\mu_2}. \qquad (1)$$

We now distinguish three cases for the values of the parameters:

Case I: $\frac{n_2}{\lambda_1} - \frac{1}{\mu_2} > \frac{n_1}{\lambda_2} - \frac{1}{\mu_1}$
Case II: $\frac{n_2}{\lambda_1} - \frac{1}{\mu_2} < \frac{n_1}{\lambda_2} - \frac{1}{\mu_1}$
Case III: $\frac{n_2}{\lambda_1} - \frac{1}{\mu_2} = \frac{n_1}{\lambda_2} - \frac{1}{\mu_1}$

Case I

In this case, by (1) we will have $T_2 > T_1$, and the system will decouple. The reasoning is as follows: because our policy is ALIS, each server, on completion of service, joins the end of the queue of idle servers, and his idle period consists of waiting until all the servers ahead of him who are of his type, as well as all the other servers that can serve customers who are compatible with him, are assigned to customers, and he is then assigned to the next compatible customer. At the end of his idle period, a server of type $s_i$ has been idle for $Y_i$, and he is then the longest idle server of his type. If we assume the idle times $Y_i$ converge to their means $T_i$ as the system becomes large, $i = 1, 2$, then since $T_2 > T_1$, we can say that most of the time the longest idle server will be of type $s_2$. Therefore almost all the arriving customers of type $c_1$ will be assigned to a server of type $s_2$, and so servers of type $s_1$ will serve almost only customers of type $c_2$. This implies that in Case I the system under many-server scaling will behave like two separate M/M/s queues.

Because servers of type $s_2$ serve almost all customers of type $c_1$, and servers of type $s_1$ serve all customers of type $c_2$ and almost none of the customers of type $c_1$, we have, for large $n$, $\alpha + \theta \approx 1$, the inequalities (1) will be close to equalities, and we will have (by Little's law)

$$E(I_1) \approx \lambda_2 T_1 \approx n_1 - \frac{\lambda_2}{\mu_1}, \qquad E(I_2) \approx \lambda_1 T_2 \approx n_2 - \frac{\lambda_1}{\mu_2}.$$

We can also estimate the value of $K$, the location of the first type $s_1$ server. Since service completions of customers of type $c_1$ occur at rate $\lambda_1$ and almost all of those are served by type $s_2$, and service completions of customers of type $c_2$ occur at rate $\lambda_2$ and all of those and almost no others are served by type $s_1$, servers of type $s_2$ and $s_1$ join the end of the queue of idle servers at the ratio of $\lambda_1/\lambda_2$, so $(I_2 - K)/I_1 \approx \lambda_1/\lambda_2$ and

$$E(K) \approx E(I_2) - E(I_1)\frac{\lambda_1}{\lambda_2} = \lambda_1 (T_2 - T_1).$$
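The case distinction and the Case I estimates can be sketched numerically. The code below compares the lower bound on $T_2$, namely $n_2/\lambda_1 - 1/\mu_2$, with the upper bound on $T_1$, namely $n_1/\lambda_2 - 1/\mu_1$, and, treating the bounds as equalities (the decoupled heuristic), evaluates the Little's law estimates of $E(I_1)$, $E(I_2)$ and $E(K) \approx \lambda_1(T_2 - T_1)$. The function names and the floating-point tolerance used to detect the boundary Case III are assumptions of this sketch, not part of the paper.

```python
import math


def idle_time_bounds(lam1, lam2, n1, n2, mu1, mu2):
    """Upper bound on T1 and lower bound on T2 from the heuristic argument."""
    upper_t1 = n1 / lam2 - 1.0 / mu1
    lower_t2 = n2 / lam1 - 1.0 / mu2
    return upper_t1, lower_t2


def classify_case(lam1, lam2, n1, n2, mu1, mu2):
    """Return "I" (decoupled), "II" (pooled) or "III" (boundary)."""
    upper_t1, lower_t2 = idle_time_bounds(lam1, lam2, n1, n2, mu1, mu2)
    # The tolerance is an assumption: exact equality is fragile in floats.
    if math.isclose(lower_t2, upper_t1, rel_tol=1e-9, abs_tol=1e-12):
        return "III"
    return "I" if lower_t2 > upper_t1 else "II"


def decoupled_estimates(lam1, lam2, n1, n2, mu1, mu2):
    """Heuristic Case I estimates (E(I1), E(I2), E(K)), bounds taken as equalities."""
    t1, t2 = idle_time_bounds(lam1, lam2, n1, n2, mu1, mu2)
    e_i1 = lam2 * t1           # Little's law: equals n1 - lam2 / mu1
    e_i2 = lam1 * t2           # Little's law: equals n2 - lam1 / mu2
    e_k = lam1 * (t2 - t1)     # E(I2) - E(I1) * lam1 / lam2
    return e_i1, e_i2, e_k


# Parameters of the numerical example in Sect. 5 with alpha = 0.4
print(classify_case(40, 60, 100, 100, 1, 1))
print(decoupled_estimates(40, 60, 100, 100, 1, 1))
```

The estimates are only meaningful when `classify_case` returns "I"; in Case II the bounds in (1) are strict and the pooled calculation applies instead.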
It is worthwhile to note that the condition of Case I that implies decomposition is not simply that the load of customers of type $c_2$ on servers of type $s_1$ is higher than the load of customers of type $c_1$ on servers of type $s_2$, that is, $\frac{\lambda_2}{n_1\mu_1} > \frac{\lambda_1}{n_2\mu_2}$. In fact, under FCFS, servers of both types may share service of customers of type $c_1$ even when this inequality holds. To explain, when it holds, under decoupled service, the load and therefore the busy time percentage of type $s_2$ servers is smaller than the load of type $s_1$ servers, but, if $\mu_1 < \mu_2$, the idle time of type $s_2$ servers ($Y_2$) could be shorter than that of type $s_1$ servers ($Y_1$). In that case, under FCFS the work of $c_1$ customers will be shared by both types of servers.

The stationary behavior of the decoupled system is described in Fig. 3. In this figure we have, from left to right, a section of busy servers of both types serving all the customers in the system, followed by a section of more recent queueing idle servers of mixed types, followed by a section of the oldest idle servers, all of which are of type $s_2$. Servers that complete service join the queue of idle servers at its left end. Arriving customers of type $c_1$ pick the oldest waiting server, which is of type $s_2$; arriving customers of type $c_2$ skip all the $K$ servers of type $s_2$, and pick the oldest idle server of type $s_1$. Note that the idle servers of both types are mixed in the middle section, and $I_2 = I_1 + K$.

The exact limiting behavior under many-server scaling for Case I is derived in Sect. 4.4, where the heuristic calculations are verified. Our main results for Case I are:

- The probability that $K = 0$ converges to 0 as $n \to \infty$, and so almost every customer of type $c_1$ is served by a server of type $s_2$.
- The two sets of servers and their customers behave like independent M/M/$n_1$ and M/M/$n_2$ queues.

Case II

In this case, we argue that $T_1 \approx T_2$ as $n \to \infty$. Assume to the contrary that $T_1 > T_2$ as $n \to \infty$.
Then, for large $n$, we should have that most of the time the longest idle server will be of type $s_1$. But $s_1$ servers can serve all customers, and so by ALIS $s_1$ servers will serve almost all the customers in the system, which is a contradiction. Now assume that $T_2 > T_1$ as $n \to \infty$. But in that case we already argued that the system will decouple and so the inequalities in (1) will hold as equalities, which, since we are in Case II, contradicts $T_2 > T_1$. Therefore, there is no decoupling in Case II, and we conclude that, for large $n$,

$$\frac{n_2}{\lambda_1} - \frac{1}{\mu_2} < T_2 \approx T_1 < \frac{n_1}{\lambda_2} - \frac{1}{\mu_1}.$$

Our first conclusion from $T_2 > \frac{n_2}{\lambda_1} - \frac{1}{\mu_2}$ is that servers of type $s_2$ do not serve all the customers of type $c_1$, so $1 - \theta < \alpha$, i.e., $\alpha + \theta > 1$, and from $T_1 < \frac{n_1}{\lambda_2} - \frac{1}{\mu_1}$ we conclude that servers of type $s_1$ serve some customers of type $c_1$ as well as customers of type $c_2$ (again, $\theta > 1 - \alpha$).

The following is a heuristic description of the behavior of the system in Case II under many-server scaling. When $n$ increases, the (random) number of idle servers becomes large, of order $O(n)$, and successive servers join the queue of idle servers at short intervals (of expected length $1/\lambda$, which is $O(1/n)$). They will spend a time of $O(1)$ to traverse the queue and will then reach the head of the queue of idle servers with short intervals between them. At this point they will need to wait for a compatible customer, and this waiting time does depend on the type of server, but because $\lambda$ is large, once a server is at the head of the line his wait for a compatible customer will be short; hence, successive server arrivals to the idle queue are close to each other and so are their departures from the idle queue. So, as $n \to \infty$, not only does $T_1 = T_2$, but also the idle times, $Y_1$ and $Y_2$, have the same distribution, and $K$ is of order $O(1)$. This heuristic description will be verified in Sect. 4.

We denote by $T$ the presumed common value of $T_1$ and $T_2$. We now calculate the value of $T$.
Let $T$ be the average length of the idle time, common to all servers. The average cycle times will be $1/\mu_1 + T$ and $1/\mu_2 + T$. We defined $\theta$ as the long-run fraction of services performed by $s_1$ servers, with a fraction $1 - \theta$ performed by type $s_2$ servers. The cycle rate of one type $s_1$ server is $1/(1/\mu_1 + T)$; hence, the processing rate of all type $s_1$ servers is $n_1/(1/\mu_1 + T)$, which should equal $\lambda\theta$. Similarly, the flow rate out of all type $s_2$ servers should equal $\lambda(1 - \theta)$. That is,

$$\lambda\theta = \frac{n_1}{1/\mu_1 + T}, \qquad \lambda(1 - \theta) = \frac{n_2}{1/\mu_2 + T}.$$

Now we solve for $T$ and $\theta$ to obtain

$$\theta = \frac{n_1}{\lambda} \cdot \frac{1}{1/\mu_1 + T}, \qquad 1 - \theta = \frac{n_2}{\lambda} \cdot \frac{1}{1/\mu_2 + T}, \qquad (2)$$

and a quadratic equation for $T$:

$$g(T) = \lambda\mu_1\mu_2 T^2 + \big[\lambda(\mu_1 + \mu_2) - (n_1 + n_2)\mu_1\mu_2\big] T + \lambda - n_1\mu_1 - n_2\mu_2 = 0.$$

Here $g(0) < 0$ because $\rho < 1$, so the equation has one positive and one negative root. Solving for positive $T$ we get

$$T = \frac{(n_1 + n_2)\mu_1\mu_2 - \lambda(\mu_1 + \mu_2) + \sqrt{\big[\lambda(\mu_1 + \mu_2) - (n_1 + n_2)\mu_1\mu_2\big]^2 + 4\lambda\mu_1\mu_2(n_1\mu_1 + n_2\mu_2 - \lambda)}}{2\lambda\mu_1\mu_2}. \qquad (3)$$

Note: for the case of $\mu_1 = \mu_2 = \mu$ we get $T = \frac{1 - \rho}{\rho} \cdot \frac{1}{\mu}$.

From $T$ and Little's law we can obtain $m_i$, the approximate average number of idle servers in pool $i$, $i = 1, 2$:

$$m_1 = T\lambda\theta = \frac{T n_1}{T + 1/\mu_1}, \qquad m_2 = T\lambda(1 - \theta) = \frac{T n_2}{T + 1/\mu_2}.$$

When $T_1 = T_2$, servers are pooled. Servers share the load, and both types of customers receive similar levels of service. The pooled behavior of the system for FCFS–ALIS under many-server scaling is our main interest in this paper. Figure 4 shows the analog of Fig. 3 for the pooled system. Note that the idle servers of both types are mixed, and $I_2 = I_1$.

Case III

This case lies on the boundary of the other two cases. As a sanity check, on the one hand, we see that setting $T_1 = \frac{n_1}{\lambda_2} - \frac{1}{\mu_1}$ and $T_2 = \frac{n_2}{\lambda_1} - \frac{1}{\mu_2}$ would correspond to the values for Case I, and result in $T_1 = T_2$. On the other hand, considering the equation (2) for Case II, if we substitute $T = T_1 = T_2$ we obtain

$$\theta = \frac{n_1}{\lambda} \cdot \frac{1}{1/\mu_1 + T} = \frac{n_1}{\lambda} \cdot \frac{1}{1/\mu_1 + n_1/\lambda_2 - 1/\mu_1} = \frac{\lambda_2}{\lambda} = 1 - \alpha,$$
$$1 - \theta = \frac{n_2}{\lambda} \cdot \frac{1}{1/\mu_2 + T} = \frac{n_2}{\lambda} \cdot \frac{1}{1/\mu_2 + n_2/\lambda_1 - 1/\mu_2} = \frac{\lambda_1}{\lambda} = \alpha, \qquad (4)$$

so that $\alpha + \theta = 1$, as in the decoupled system of Case I.

4 Many-server limit of the stationary distribution

In this section, we keep the stability assumption of Sect. 2 and derive the many-server limit from the exact stationary distributions.

4.1 Exact stationary distributions

We first obtain the stationary distribution for each state $s$. We note that the stationary probabilities depend mainly on the values of $k$, $i_1$, $i_2$. Let $\mu(S_j)$ denote the service rate of the server at position $j$.

Theorem 1 The stationary distribution of the state $s$ of the FCFS–ALIS many-server N-system is given by … (5)

Proof This follows for all three parts of (5) by utilizing properties (i), (ii), (iii) in Sect. 2 and substituting into Equation (2.1), Theorem 2.1, in [2].

Before we manipulate Eq. (5), we introduce a lemma to facilitate the calculation.

Lemma 1 Letting $A_1, \dots, A_m$ denote a permutation of $m$ given positive real numbers $a_1, \dots, a_m$, we have

$$\sum_{(A_1, \dots, A_m) \in P(a_1, \dots, a_m)} \; \prod_{l=1}^{m} \Big( \sum_{j=1}^{l} A_j \Big)^{-1} = \Big( \prod_{l=1}^{m} a_l \Big)^{-1},$$

where $P(a_1, \dots, a_m)$ denotes the set of all the permutations of $a_1, \dots, a_m$.

Now we can get the joint stationary distribution of $K$, $I_1$, $I_2$. We denote by $\pi(k, i_1, i_2)$ the stationary probability of $K = k$, $I_1 = i_1$ and $I_2 = i_2$.

Theorem 2 The steady-state joint distribution of $K$, $I_1$, $I_2$ is given by … (6) where $B_1$ is a normalizing constant.

4.2 The distribution of $(I_1, I_2)$ given $K$

In this section we obtain the asymptotic distribution of $(I_1, I_2)$ conditional on $K = k$, as $n \to \infty$. We first show that, as $n \to \infty$, the probability of no idle servers of type $s_1$ goes to zero, and so the probability that customers need not wait goes to 1. Next we condition on $K = k$ and show $I_1/n \xrightarrow{p} f_1$, $I_2/n \xrightarrow{p} f_2$, where

$$f_1 = \frac{m_1}{n} = \frac{\beta T}{T + 1/\mu_1}, \qquad f_2 = \frac{m_2}{n} = \frac{(1 - \beta) T}{T + 1/\mu_2},$$

where $T$ is given in (3). Finally, we condition on $K = k$ and show that the scaled and centered values of $(I_1, I_2)$ converge in distribution to a bivariate normal distribution. Proofs of the following theorems can be found in the Appendix.
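The pooled (Case II) fluid quantities can be evaluated directly from the quadratic equation for $T$ of Sect. 3. The sketch below takes the positive root, then computes $\theta$ from the flow balance $\lambda\theta = n_1/(1/\mu_1 + T)$ and the idle fractions $f_1$, $f_2$. The function name `pooled_fluid` is hypothetical, and the code does not check that the parameters actually fall in Case II; it only requires $\lambda < n_1\mu_1 + n_2\mu_2$ so that a positive root exists.

```python
import math


def pooled_fluid(lam, n1, n2, mu1, mu2):
    """Return (T, theta, f1, f2) for the pooled fluid heuristic.

    T is the positive root of
        lam*mu1*mu2*T^2 + (lam*(mu1+mu2) - (n1+n2)*mu1*mu2)*T
            + lam - n1*mu1 - n2*mu2 = 0.
    """
    a = lam * mu1 * mu2
    b = lam * (mu1 + mu2) - (n1 + n2) * mu1 * mu2
    c = lam - n1 * mu1 - n2 * mu2
    if c >= 0:
        raise ValueError("need lam < n1*mu1 + n2*mu2 for a positive root")

    # a > 0 and c < 0, so the discriminant is positive and exactly one root is positive.
    t = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)

    theta = n1 / (lam * (1.0 / mu1 + t))   # fraction of services by s1 servers
    n = n1 + n2
    f1 = t * n1 / (t + 1.0 / mu1) / n      # m1 / n
    f2 = t * n2 / (t + 1.0 / mu2) / n      # m2 / n
    return t, theta, f1, f2


# Parameters of the numerical example in Sect. 5: rho = 0.5, beta = 0.5
t, theta, f1, f2 = pooled_fluid(lam=100, n1=100, n2=100, mu1=1, mu2=1)
print(t, theta, f1 * 200, f2 * 200)
```

With equal service rates this reproduces $T = (1 - \rho)/(\rho\mu)$ and $\theta = \beta$, and the predicted number of idle servers in each pool is $f_i n = 50$, the fluid prediction quoted in Sect. 5.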
Theorem 3 When $n \to \infty$, there exists an $\epsilon > 0$ such that $P(I_1 = 0) = o(\exp(-\epsilon n))$.

From this theorem we see that when $n \to \infty$, $P(I_1 > 0) \to 1$. Therefore, $P(K = k, I_1 > 0) \approx P(K = k)$ for any $0 \le k \le I_2$. From Eq. (6), given $K = k$, the limiting stationary distribution as $n \to \infty$ is

$$P(I_1 = i_1, I_2 = i_2 \mid K = k) \approx P(I_1 = i_1, I_2 = i_2 \mid K = k, I_1 > 0) = B_1 \binom{n_1}{i_1} \binom{n_2}{i_2} \, i_1 (i_1 + i_2 - k - 1)! \, \frac{i_2!}{(i_2 - k)!} \, \mu_1^{i_1} \mu_2^{i_2} \lambda^{-i_1 - i_2 + k} \lambda_1^{-k} \, \frac{1}{P(K = k)}.$$

Theorem 4 Conditional on $K = k$, $\left(\frac{I_1}{n}, \frac{I_2}{n}\right)$ converges to $(f_1, f_2)$ in probability for any $k \ge 0$. That is, for any $\epsilon > 0$, when $n \to \infty$, we have

$$P\big(|I_1 - f_1 n| \ge \epsilon n \text{ or } |I_2 - f_2 n| \ge \epsilon n \mid K = k\big) \to 0.$$

After showing the fluid limit result, we are now ready to show the central limit result.

Theorem 5 For any $k \ge 0$, when $n \to \infty$, conditional on $K = k$, $\left(\frac{I_1 - f_1 n}{\sqrt{n}}, \frac{I_2 - f_2 n}{\sqrt{n}}\right)$ converges in distribution to a bivariate normal distribution … (7)

4.3 Case II: Pooled system

Now we consider Case II, where $\frac{n_2}{\lambda_1} - \frac{1}{\mu_2} < \frac{n_1}{\lambda_2} - \frac{1}{\mu_1}$. First we show the limit distribution of $K$, the location of the first type $s_1$ server.

Theorem 6 In Case II, for any $k \ge 0$, as $n \to \infty$,

$$P(K = k) \to \Big(1 - \frac{1 - \theta}{\alpha}\Big) \Big(\frac{1 - \theta}{\alpha}\Big)^k.$$

Theorem 6 shows that $K$ converges in distribution to a geometric distribution in Case II, so $P(K < \infty) = 1$. Therefore, we can extend Theorems 4 and 5 into unconditional versions.

Theorem 7 In Case II, as $n \to \infty$, $K$ becomes independent of $I_1$ and $I_2$. $\left(\frac{I_1 - f_1 n}{\sqrt{n}}, \frac{I_2 - f_2 n}{\sqrt{n}}\right)$ converges in distribution to the bivariate normal distribution described in (10).

Consider the special case when $\mu_1 = \mu_2 = \mu$. Then $\theta = \beta$, $f_1 = (1 - \rho)\beta$ and $f_2 = (1 - \rho)(1 - \beta)$. When $n \to \infty$, $\left(\frac{I_1 - (1 - \rho) n_1}{\sqrt{n}}, \frac{I_2 - (1 - \rho) n_2}{\sqrt{n}}\right)$ converges in distribution to a bivariate normal distribution with mean $(0, 0)$, variance … and correlation … The total idleness has mean $(1 - \rho)n$ and variance $\mathrm{Var}(I_1) + \mathrm{Var}(I_2) + 2\,\mathrm{Cov}(I_1, I_2) = \rho n$.

4.4 Case I: Decoupling to two independent systems

We now assume $\frac{n_2}{\lambda_1} - \frac{1}{\mu_2} > \frac{n_1}{\lambda_2} - \frac{1}{\mu_1}$, where we find that under many-server scaling the system decouples into two independent M/M/s service systems. We first show the following proposition:

Proposition 1 In Case I, as $n \to \infty$, we have $P\big(\alpha I_1 \ge (1 - \alpha) I_2\big) = o\big(\tfrac{1}{\sqrt{n}}\big)$.
We next obtain the conditional distribution $K \mid (I_1, I_2)$.

Theorem 8 Given $I_1 = i_1 n$, $I_2 = i_2 n$, where $i_1 \in (0, \beta)$, $i_2 \in (0, 1 - \beta)$, and $i_2 > \frac{\alpha}{1 - \alpha} i_1$, we have

$$\frac{K - \big(i_2 - \frac{\alpha}{1 - \alpha} i_1\big) n}{\sqrt{n}} \;\Big|\; (I_1 = i_1 n, I_2 = i_2 n) \to N(0, \dots), \quad \text{as } n \to \infty.$$

Therefore, given $(1 - \alpha) I_2 > \alpha I_1$, $P(K = 0 \mid I_1, I_2) = o\big(\tfrac{1}{\sqrt{n}}\big)$. Now we have

$$P(K = 0) < P(K = 0 \mid I_1, I_2) + P\big((1 - \alpha) I_2 \le \alpha I_1\big) = o\Big(\frac{1}{\sqrt{n}}\Big). \qquad (9)$$

That means the number of type $c_1$ customers served by $s_1$ servers is no more than $o(\sqrt{n})$, which cannot affect the fluid scaled mean or the diffusion scaled variance of two independent decoupled systems.

Theorem 9 In Case I, as $n \to \infty$,

$$\left( \frac{I_1 - \big(n_1 - \frac{\lambda_2}{\mu_1}\big)}{\sqrt{n}}, \; \frac{I_2 - \big(n_2 - \frac{\lambda_1}{\mu_2}\big)}{\sqrt{n}} \right) \to \dots$$

This is exactly the many-server scaling limiting distribution of the number of idle servers in two independent M/M/s queues, one of which has arrival rate $\lambda_2$, service rate $\mu_1$, and $n_1$ servers; the other has arrival rate $\lambda_1$, service rate $\mu_2$, and $n_2$ servers.

Furthermore, $K$ will then consist of $I_2$ minus the idle servers of type $s_2$ which are mingled with the $I_1$ servers of type $s_1$. The following calculation obtains the mean and variance of $K$ under many-server scaling. We denote by $I_{2,1}$ the number of idle servers of type $s_2$ that are mingled with the $I_1$ idle servers of type $s_1$. Since the type $s_1$ servers join the idle servers with rate $\lambda_2$ and type $s_2$ servers join the idle servers with rate $\lambda_1$, we have

$$I_{2,1} = \sum_{j=1}^{I_1} W_j,$$

where $W_j$ are i.i.d. random variables independent of $I_1$, each of them having the distribution of the number of failures before the first success in a sequence of Bernoulli trials with probability of success $\frac{\lambda_2}{\lambda_1 + \lambda_2}$. We have

$$E(W_j) = \frac{\lambda_1}{\lambda_2}, \qquad \mathrm{Var}(W_j) = \frac{\lambda_1(\lambda_1 + \lambda_2)}{\lambda_2^2}, \qquad \mathrm{Var}(I_{2,1}) = E(I_1)\frac{\lambda_1(\lambda_1 + \lambda_2)}{\lambda_2^2} + \mathrm{Var}(I_1)\Big(\frac{\lambda_1}{\lambda_2}\Big)^2.$$

Furthermore, as $n \to \infty$, centered and scaled $I_{2,1}$ converges to a normal distribution, and is independent of $I_2$. It now follows that centered and scaled $K$ also converges to a normal distribution, and centered and scaled $(I_1, I_2, K)$ converge to a multivariate normal distribution. The relevant parameters are

$$E(K) = E(I_2) - E(I_{2,1}) = n_2 - \frac{\lambda_1}{\mu_2} - \frac{\lambda_1}{\lambda_2}\Big(n_1 - \frac{\lambda_2}{\mu_1}\Big),$$
$$\mathrm{Var}(K) = \mathrm{Var}(I_2) + \mathrm{Var}(I_{2,1}) = \frac{\lambda_1}{\mu_2} + \mathrm{Var}(I_{2,1}).$$

$K$ is correlated with both $I_1$ and $I_2$:

$$\mathrm{Cov}(I_2, K) = \mathrm{Cov}(I_2, I_2 - I_{2,1}) = \mathrm{Var}(I_2),$$
$$\mathrm{Cov}(I_1, K) = \mathrm{Cov}(I_1, I_2 - I_{2,1}) = \mathrm{Cov}(I_1, -I_{2,1}) = -\frac{\lambda_1}{\lambda_2}\mathrm{Var}(I_1).$$

4.5 Case III: Slowly decoupling as system becomes large

As $n \to \infty$, we have seen that when $\frac{n_2}{\lambda_1} - \frac{1}{\mu_2} < \frac{n_1}{\lambda_2} - \frac{1}{\mu_1}$ (Case II), then $\frac{K}{n} \to 0$ in probability, and in fact $K = O(1)$; when …

Proposition 2 Keep all the other parameters fixed and change $\alpha$. If $\alpha_1 < \alpha_2$, then $K_{\alpha_1}$ stochastically dominates $K_{\alpha_2}$.

From the monotonicity and the previous statements for Cases I and II, we conclude:

Corollary 1 In Case III, as $n \to \infty$, $\frac{K}{n} \to 0$ in probability.

We can in fact derive more precise asymptotic results for $I_1$, $I_2$, $K$ in Case III. We note first that the result of Theorem 5 on the limiting distribution of $\left(\frac{I_1 - m_1}{\sqrt{n}}, \frac{I_2 - m_2}{\sqrt{n}}\right) \Big| K = k$ as $n \to \infty$, for any fixed $k$, is valid not just in Case II, but also in Cases I and III. In the following theorem we investigate the limit, for fixed $k$, as $n \to \infty$, of $\left(\frac{I_1 - m_1}{\sqrt{n}}, \frac{I_2 - m_2}{\sqrt{n}}\right) \Big| K = kn$.

Theorem 10 For any $k$ in the range …, as $n \to \infty$, conditional on $K = kn$, $\left(\frac{I_1 - f_{1,k} n}{\sqrt{n}}, \frac{I_2 - f_{2,k} n}{\sqrt{n}}\right)$ converges in distribution to a bivariate normal distribution with mean zero … (10) where

$$f_{1,k} = \frac{\beta T}{T + 1/\mu_1}, \qquad f_{2,k} = \frac{(1 - \beta - k) T}{T + 1/\mu_2} + k,$$

and $T > 0$ solves

$$\frac{n_1}{\lambda} \cdot \frac{1}{1/\mu_1 + T} + \frac{n_2 - kn}{\lambda} \cdot \frac{1}{1/\mu_2 + T} = 1.$$

Note that $f_{i,0}$ equals $f_i$, defined in Sect. 4.2, for $i = 1, 2$. So when $k = 0$, Theorem 10 agrees with Theorem 5.

We can now use these results to obtain the centered and scaled limiting behavior of $K$ in Case III.

Theorem 11 In Case III, as $n \to \infty$, $\frac{K}{\sqrt{n}}$ converges to a half truncated normal distribution with density function …

The result of Theorem 11 in combination with Theorem 10 should in principle allow us to obtain the joint distribution of $(I_1, I_2)$. Its centered and scaled limit is, however, not a bivariate normal distribution, and too messy to write down.

Theorem 11 directly implies that $P(K = 0) \to 0$ as $n \to \infty$.
That means the proportion of type $c_1$ customers who are served by type $s_1$ servers goes to 0. Therefore, we can obtain the following fluid limit result:

Corollary 2 In Case III, the fluid limits of the idle-server counts as $n \to \infty$ are the same as in Case I.

4.6 Comparison to the bipartite FCFS infinite matching model

The infinite matching model was defined and studied in [1,5,8] and is as follows: there are a set of customer types $C = \{c_1, \ldots, c_I\}$ with a probability vector $\alpha = (\alpha_1, \ldots, \alpha_I)$, a set of server types $S = \{s_1, \ldots, s_J\}$ with a probability vector $\beta = (\beta_1, \ldots, \beta_J)$, and a bipartite compatibility graph $G \subseteq C \times S$. There are two infinite sequences: $C^1, C^2, \ldots$, where the $C^m$ are drawn i.i.d. from $C$ with probabilities $\alpha$, and $S^1, S^2, \ldots$, where the $S^n$ are drawn i.i.d. from $S$ with probabilities $\beta$. The two sequences are matched according to the compatibility graph, using FCFS. That is, $C^1$ is matched to the earliest $S^n$ in the server sequence that is compatible with it, and thereafter $C^m$ is matched to the earliest $S^n$ in the server sequence that is compatible with it and that was not matched to one of the customers $C^1, \ldots, C^{m-1}$.

This model is much simpler than a parallel servers queueing model, because there are no arrival times, no busy or idle servers (only a sequence of server types), and no processing times: there are only ordered customer types and ordered server types, matched in the FCFS manner. This model is tractable: under a condition of complete resource pooling the system reaches a steady state, and in particular it is possible to calculate the matching rate $r_{c_i,s_j}$ for each compatible pair, that is, the frequency of matches between customer type $c_i$ and server type $s_j$.

In the special case of the infinite matching model corresponding to the N-system, there is an infinite sequence of customers of types $c_1, c_2$, where the customer types are i.i.d., the type being $c_1$ with probability $\alpha$ and $c_2$ with probability $1 - \alpha$, and an independent infinite sequence of servers of types $s_1, s_2$, where the server types are i.i.d., the type being $s_1$ with probability $\beta$ and $s_2$ with probability $1 - \beta$. The compatibility graph $G$ has arcs $\{(c_1, s_1), (c_1, s_2), (c_2, s_1)\}$. The condition for complete resource pooling is then $\alpha + \beta > 1$, corresponding to Case II in our queueing model. Based on the exact formula in [1], successive customers and servers are matched according to FCFS, with matching rates

$$r_{c_1,s_1} = \alpha + \beta - 1, \qquad r_{c_1,s_2} = 1 - \beta, \qquad r_{c_2,s_1} = 1 - \alpha.$$

After $n$ customers have arrived and been matched, there may be some unmatched $s_2$ servers skipped by the customers. We define $K_n$ to be the number of unmatched $s_2$ servers before the first unmatched $s_1$ server after the first $n$ customers have been matched. We can see that $(K_n)_{n=1}^{\infty}$ is a Markov chain. If $K_n = 0$, that means server $S^{n+1}$ is of type $s_1$, and then a new customer $C^{n+1}$ will be matched to $S^{n+1}$ and will add a geometrically distributed number with parameter $\beta$ to $K_n$. If $K_n > 0$, then a new customer $C^{n+1}$ of type $c_1$ will reduce $K_n$ by 1, and a new customer $C^{n+1}$ of type $c_2$ will add a geometrically distributed number with parameter $\beta$ to $K_n$. The steady-state distribution for this Markov chain is

$$P(K_\infty = k) = \left(1 - \frac{1 - \beta}{\alpha}\right)\left(\frac{1 - \beta}{\alpha}\right)^k, \quad k \ge 0,$$

which is exactly the limiting distribution of $K$ in (6). This supports our intuition that when the large N-system is underloaded with resource pooling in Case II, the replenishment of idle servers of types $s_1$ and $s_2$ becomes i.i.d. with probability $\beta$ and $1 - \beta$, respectively.

In the infinite matching model, if complete resource pooling fails then there is a subset of customer types whose frequency is larger than or equal to the frequency of all the compatible server types. In that case the infinite matching model will not reach steady state. However, in such cases there will be a unique decomposition of the model, so that each component on its own is an infinite matching model with complete resource pooling.
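The Markov chain for $K_n$ described in this section can be checked numerically. The Python sketch below tests whether the geometric distribution with ratio $(1 - \beta)/\alpha$ satisfies the balance equations of the chain. It rests on assumptions that are not spelled out in the text: `alpha` is the probability that a customer is of type $c_1$, `beta` is the probability that a server is of type $s_1$, and "geometrically distributed with parameter $\beta$" is read as the number $j \ge 0$ of $s_2$ servers preceding the next $s_1$ server, with probability $\beta(1 - \beta)^j$. The function names are illustrative only.

```python
def geometric_pmf(ratio, k):
    """P(K = k) for a geometric distribution on {0, 1, ...} with the given ratio."""
    return (1.0 - ratio) * ratio ** k


def balance_residual(alpha, beta, k):
    """Candidate stationary probability of state k minus the probability flow into k.

    alpha: probability that a customer is of type c1 (assumed notation)
    beta:  probability that a server is of type s1 (assumed notation)
    """
    q = 1.0 - beta                      # probability that a server is of type s2
    ratio = q / alpha                   # candidate geometric ratio (1 - beta) / alpha

    def pi(j):
        return geometric_pmf(ratio, j)

    def jump(j):
        # Assumption: j servers of type s2 precede the next s1 server w.p. beta * q**j
        return beta * q ** j

    inflow = pi(0) * jump(k)            # from K = 0: the customer takes the s1 server
    inflow += alpha * pi(k + 1)         # from K = k + 1: a c1 customer removes one s2 server
    # from K = j in 1..k: a c2 customer takes the s1 server and k - j servers of type s2 are added
    inflow += (1.0 - alpha) * sum(pi(j) * jump(k - j) for j in range(1, k + 1))
    return pi(k) - inflow


if __name__ == "__main__":
    alpha, beta = 0.7, 0.6              # complete resource pooling: alpha + beta > 1
    worst = max(abs(balance_residual(alpha, beta, k)) for k in range(40))
    print(f"largest balance residual: {worst:.2e}")
```

The balance equation for state $k$ involves only states $0, \ldots, k + 1$, so no truncation of the state space is needed; residuals at the level of floating-point rounding indicate that the geometric form is stationary under the stated reading of the transition rule.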
In the case of the N-model this will happen when $\alpha + \beta \le 1$, and then the model will decouple into two subsystems, one consisting of customers and servers of types $c_1, s_2$, and the other of customers and servers of types $c_2, s_1$. This is exactly the same decomposition that we observe in Cases I and III.

5 Numerical examples

We test our results by investigating an N-system with $\lambda = 100$, $n_1 = n_2 = 100$, $\mu_1 = \mu_2 = 1$, so that each pool holds half of the servers and the arrival rate is half of the total service capacity. We use the exact stationary distribution to calculate the expectation and variance of the number of idle servers in each pool; the results are listed in Table 1.

When $\alpha > 0.5$ (Case II), the average number of idle servers in each pool is close to 50, with variance close to 37.5, the value given by the central limit approximation. When $\alpha < 0.5$ (Case I), resource pooling disappears, and $s_1$ servers seldom serve $c_1$ customers. The N-system operates like two separate queues: $s_1$ servers serve $c_2$ customers, and $s_2$ servers serve $c_1$ customers. The utilization of the $s_1$ server pool is $\frac{(1 - \alpha)\lambda}{n_1}$, and the utilization of the $s_2$ server pool is $\frac{\alpha\lambda}{n_2}$. When $\alpha = 0.4$, almost none of the services performed by $s_1$ servers are for $c_1$ customers; the number of idle $s_1$ servers can be approximated by a normal distribution with mean $n_1 - (1 - \alpha)\lambda = 40$ and variance $(1 - \alpha)\lambda = 60$, whereas the number of idle $s_2$ servers can be approximated by a normal distribution with mean $n_2 - \alpha\lambda = 60$ and variance $\alpha\lambda = 40$. When $\alpha = 0.5$ (Case III), the means are somewhat close to the fluid prediction 50, whereas we do not have an analytic approximation for the variances (Table 1).

Acknowledgements

We are grateful to Ivo Adan for helpful discussion of this paper. We thank the anonymous reviewer and the associate editor for their constructive comments, which helped us improve the manuscript.
The review team noticed that analyzing only Case II left major gaps in the original version, which resulted in the addition of the analysis of Cases I and III. Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. A Appendix: Proofs for Sect. 4.1 Proof of Lemma 1 We prove this lemma by induction. Define the left-hand side as Cm . C2 = Cm = = = 1 a1(a1 + a2) + a2(a1 + a2) = a1a2(a1 + a2) = a1a2 1 . 1 m ? A j ? a1 + a2 ? ?1 ( A1,...,Am )?P(a1,...,am ) l=1 m 1 m l=1 al p=1 ( A1,...,Am?1)?P(a j : j = p) l=1 j = p a j ? ? ?1 = 1 m l=1 al A j ? l j=1 m p=1 a p m j=1 a j ? ?1 = m l=1 al ?1 . , 1 , j=n?k ?1n1 + ?2( j ? n1) ? ?2 n?1 1 1 , k = 0, . . . , n2, j=n?k ?1n1 + ?2( j ? n1) ? ?2 ?1n1 + ?2n2 ? ? i1 = i2 = 0. (11) Each permutation of the remaining servers, S j , n ? max{k + 1, i1 + i2} < j ? n has the same stationary probability. It remains to count the number of permutations. When i1 = 0 we have i2 ? k. For each permutation we choose 1 type s1 server and k out of n2 type s2 servers to form the last k + 1 servers. The number of permutations is Proof of Theorem 2 Summation over the geometric terms q j = 0, . . . , ? in (5) gives q1,...,qn?i , , Next we see that in this expression, permutations of S1, . . . , Sn with the same (k, i1, i2) have a similar structure. We now sum over all the permutations of the appropriate S j , 1 ? j ? n ? max{k + 1, i1 + i2}. By Lemma 1 we obtain k = 0, . . . , n2, i1 = 0, . . . , n1 i2 = k, . . . , n2, k = 1, . . . , n2, i1 = 0, i2 = 1, . . . , k, When i1 > 0, we have i2 ? k. For each permutation, we choose i1 out of n1 type s1 servers and i2 out of n2 type s2 servers. 
We then choose 1 from the i1 idle servers of type s1, and k from the i2 idle servers of type s2 to obtain the last k + 1 servers. The number of permutations is n1 i1 n2 i i2 i 1 k2 (i1 + i2 ? k ? 1)!k! = n1 i1 n2 i1i2!(i1 + i2 ? k ? 1)! . i2 (i2 ? k)! Multiplying the terms in (11) by the appropriate number of permutations and defining B1 = B?1?n1 ??n2 gives (6). 2 B Appendix: Proofs for Sect. 4.2 Proof of Theorem 3 We prove the theorem in three steps: (i) We show that P(I1 = 0, I2 = 0) = ?(k, 0, 0) n1 n2! ?1?k2 B1 (n2 ? k)! n?1 j=n?k We use induction to calculate from m = n2 to m = 1. When m = n2, , 1 ?1n1 + ?2( j ? n1) ? ?2 ?1n1 + ?2n2 ? ? 1 . where ? means the ratio of the two sides converges to 1 when n ? ?, m1 and m2 are defined in (4). Note that the definition in (4) does not require a specific ? . case. And for all cases, we have mm21 = 1?? (iii) We show that, as n ? ?, k=m (n2 ? k)! j=n?k ?1n1 + ?2( j ? n1) ? ?2 ? ?2? n2 exp (n2 (? log ? + ? ? 1)) , 0 < ? < 1, P (I1 = 0) ? B1 1 ?1 ? ? ?? ?1?2(1?1??n??2)/?2+, ? ?11 , ?? >= 11., m1 ( m1 + m2 ? 1)!?1m1 ?2m2 1 m1 + m2 ? B1 2??n1n2 = n5?1?2 (n1 ? m1)(n2 ? m2)m2 B1 2??n1n2 = n5?1?2 (n1 ? m1)(n2 ? m2)m2 m2 ? exp ?n2 log 1 ? n2 . (m1 + m2)!?1m1 ?2m2 ??m1?m2 nn11 nn2 2 (n1 ? m1)n1?m1 m1m1 (n2 ? m2)n2?m2 m2m2 The second equality is due to m1m+1m2 = ?, m1m+2m2 = 1 ? ?, ?1(n1??m1) = ?, 1/2 When n ? ?, note that (n1?m21?)?(nn21?n2m2)m2 is of the order of n?1/2. Therefore, P(I1 = m1 , I2 = m2 , K = 0)/B1 increases exponentially. When ? > 1, P(I1 = 0)/B1 converges to a constant; when ? = 1, P(I1 = 0)/B1 increases in the order of ?n. Therefore, when n ? ? and ? ? 1, . 1/2 We have that which is nonpositive no matter whether ? + ? is larger than, equal to, or small than 1. Therefore, when n ? ?, Proof of Theorem 4 First we show that the weak convergence is valid given K = 0. Then we show that the same holds when K = k, for any fixed k. 
When K = 0, we prove the convergence in probability in two steps: (i) We show that for all states |I1 ? m1| ? n or |I2 ? m2| ? n, the conditional probability is dominated by a bounded constant multiple of the conditional probability of some point on the boundary of the rectangle |I1 ? m1| ? n ? |I2 ? m2| ? n. (ii) When n ? ?, we approximate the conditional probability of the points in the rectangle |I1 ? m1| ? n ? |I2 ? m2| ? n. We then show that the probability of points on the boundary is negligible compared with the conditional probability at ( m1 , m2 ). Proof of (i): P (I1 = i1, I2 = i2|K = 0) n1 = B2 i1 , where B2 = B1/ P (K = 0). P (I1 = i1 + 1, I2 = i2|K = 0) P (I1 = i1, I2 = i2|K = 0) P (I1 = i1, I2 = i2 + 1|K = 0) P (I1 = i1, I2 = i2|K = 0) = = ? when i1 ? m1 and (1 ? ?)i1 < i1 P(I1=i1+1,I2=i2|K =0) > 1; P(I1=i1,I2=i2|K =0) ? when i2 ? m2 and (1 ? ?)i1 > ?i2 + 1, we have ii12++i12 > P(I1=i1,I2=i2+1|K =0) > 1; P(I1=i1,I2=i2|K =0) ? when i1 > m1, i2 > m2 and (1 ? ?)i1 ? ?i2, we have i1+i2 i1 ?i2, we have i1+i2 ?1 . Therefore, ? ?1 . Therefore, P(I1=i1+1,I2=i2|K =0) < 1; P(I1=i1,I2=i2|K =0) ? when i1 > m1, i2 > m2 and (1 ? ?)i1 ? ?i2 + 1, we have ii12++i12 ? 1?? 1 . Therefore, P(I1=i1,I2=i2+1|K =0) < 1; ? wPh(eIn1=?i1i,2I2?=i2(|1K =?0)?)i1 ? ?i2 + 1, i1 ? m1 ? n and i2 ? m2 ? n, as long as n2?m2 i2+1 > 1, we have P(I1=i1,I2=i2+1|K =0) > 1. When n is large, this requires n2?i2 i2 P(I1=i1,I2=i2|K =0) For all i1 > i1? or i2 > i ?, we can move the state to a neighbor state with larger 2 steady-state probability, as shown in Fig. 5. 1 ? ? ? f2 . f2 ? i1 > i1? = f . 1 As long as nn11??mi11 i1i?11 > 1, we have P(I1=i1+1,I2=i2|K =0) > 1. When n is large, P(I1=i1,I2=i2|K =0) this requires (1-?)i1=?i2+1 (n1,n2) i1 Eventually the movement stops at the boundary which is n away from (m1, m2). Therefore, the probability of any state (i1, i2) satisfying i1 > i1? or i2 > i2? would be dominated by the probability of some point at the boundary. 
For any (i1, i2) satisfying i1 ? i1? and i2 ? i2?, since When i1 ? [m1 ? n, m1 + n] and i2 ? [m2 ? n, m2 + n], and n grows large, we can use Stirling?s approximation. P(I1 = i1, I2 = i2|K = 0) n1 = B2 i1 = B2 i1 +i1 i2 (n1 ? i1)!ni11!!(nn22! ? i2)!i2! ? B3 (i1 + i2)(n1 ?i1i1)(n2 ? i2)i2 (i1 + i2)! = B3 ?(n1 ? i1) log(n1 ? i1) ? i1 log(i1) ?(n2 ? i2) log(n2 ? i2) ? i2 log(i2) + i1 log + i2 log ?1 ? ?2 ? ? i1 ? i2 where B3 = B2n1!n2!(2? )? 23 en, x1 = in1 , x2 = n i2 . We have x1 ? [ f1 ? , f1 + ] and x2 ? [ f2 ? , f2 + ]. We define F (x1, x2) = (x1 + x2) log(x1 + x2) ? (? ? x1) log(? ? x1) ?(1 ? ? ? x2) log(1 ? ? ? x2) +x1(log ?1 ? log r ? log x1) + x2(log ?2 ? log r ? log x2) ? x1 ? x2. The first-order derivatives on x1 and x2 are ? F ? x1 ? F ? x2 ?1 = log(x1 + x2) + log(? ? x1) ? log(x1) ? log r = 0, ?2 = log(x1 + x2) + log(1 ? ? ? x2) ? log(x2) ? log r = 0. Solving the first-order conditions gives Consider the second-order derivatives: x1 = f1, x2 = f2. ?2 F 1 1 1 ? x12 = ? ? ? x1 ? x1 + x1 + x2 ?2 F 1 1 1 ? x22 = ? 1 ? ? ? x2 ? x2 + x1 + x2 < 0, < 0, ?2 F ? x1? x2 = x1 + x2 2 1 , The Hessian matrix is negative definite. Therefore, F (x1, x2) is strictly concave on (0, ? ) ? (0, 1 ? ? ) and reaches its unique global maximum at ( f1, f2). The maximum of F (x1, x2) on [?, ? ? ?] ? [?, 1 ? ? ? ?]\( f1 ? , f1 + ) ? ( f2 ? , f2 + ) is on the boundary {(x1, x2)||x1 ? f1| = , |x2 ? f2| = }. Since the boundary is a compact set, the maximum is attainable, denoted by F ( f1, f2) ? ?, where ? > 0. Note that changes slowly when x1 and x2 change, compared with exp(nF(x1, x2)). We have P(I1 = i1, I2 = i2|K = 0) P(I1 = m1 , I2 = m2 |K = 0) ? exp(n(F(x1, x2) ? F( f1, f2))) < exp(?n). Therefore, 1/2 n?1 |i1?m1|> n or |i2?m2|> n P(I1 = i1, I2 = i2|K = 0) = (i1 +(ii22?+k1)(?n2k?)?i2)?2 = (1 ? ?)nn22??mi22 ii12++i12??kk . We can use a similar two-step argument to show that (I1/n, I2/n) converges to ( f1, f2) in probability given K = k. 
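The proof of Theorem 4 states that the Hessian of $F$ is negative definite without displaying the computation. For a symmetric $2 \times 2$ matrix this amounts to a negative diagonal entry and a positive determinant. Writing $s = x_1 + x_2$, $a = 1/(\beta - x_1)$ and $b = 1/(1 - \beta - x_2)$, and using $(1/x_1 - 1/s)(1/x_2 - 1/s) = 1/s^2$, the determinant reduces to $ab + a(1/x_2 - 1/s) + b(1/x_1 - 1/s) > 0$. The Python sketch below checks this on a grid. It assumes that `beta` denotes the fraction $n_1/n$ of type $s_1$ servers, so that $F$ is considered on $(0, \beta) \times (0, 1 - \beta)$, and the function names are illustrative only.

```python
import math


def hessian_F(x1, x2, beta):
    """Second derivatives of F at (x1, x2), as listed in the proof of Theorem 4.

    beta is assumed to be the fraction n1/n of type s1 servers.
    """
    s = x1 + x2
    f11 = -1.0 / (beta - x1) - 1.0 / x1 + 1.0 / s
    f22 = -1.0 / (1.0 - beta - x2) - 1.0 / x2 + 1.0 / s
    f12 = 1.0 / s
    return f11, f12, f22


def det_closed_form(x1, x2, beta):
    """Determinant of the Hessian after cancelling the 1/s**2 terms."""
    s = x1 + x2
    a = 1.0 / (beta - x1)
    b = 1.0 / (1.0 - beta - x2)
    return a * b + a * (1.0 / x2 - 1.0 / s) + b * (1.0 / x1 - 1.0 / s)


def is_negative_definite(f11, f12, f22):
    # A symmetric 2x2 matrix is negative definite iff f11 < 0 and its determinant is positive.
    return f11 < 0 and f11 * f22 - f12 * f12 > 0


def check_grid(beta, steps=50):
    """Check negative definiteness on an interior grid of (0, beta) x (0, 1 - beta)."""
    for i in range(1, steps):
        for j in range(1, steps):
            x1 = beta * i / steps
            x2 = (1.0 - beta) * j / steps
            f11, f12, f22 = hessian_F(x1, x2, beta)
            if not is_negative_definite(f11, f12, f22):
                return False
            if not math.isclose(f11 * f22 - f12 * f12, det_closed_form(x1, x2, beta), rel_tol=1e-9):
                return False
    return True


if __name__ == "__main__":
    print(check_grid(0.5), check_grid(0.2))
```

A grid check is not a proof, but it guards against sign errors in the second derivatives; the closed-form determinant above is a sum of positive terms on the whole domain, which gives the strict concavity used in the proof.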
Proof of Theorem 5 To obtain the asymptotic distribution of I1, I2 as n ? ?, we need to consider, by Theorem 4, only values i1,i2 for which (i1 ? m1)/n ? 0 and (i2 ? m2)/n ? 0. We write i1 = m1 + z1?n, i2 = m2 + z2?n, with z1/?n ? 0, z2/?n ? 0. Note that m1, m2, n1?m1, n2?m2 are of the same order of magnitude as n,n1,n2, and we only consider i1,i2 of the same order of magnitude. = B2 i1 (i1 + i2)i2(n1 ? i1)(n2 ? i2) 1/2 , (i1 + i2)!?i11 ?i22 ??i1?i2 B1/ P(K = 0) and B3 = B2n1!n2!(2? )? 23 en. We clearly have where the use of Stirling?s approximation is justified for large n. Here B2 = i1 (i1 + i2)i2(n1 ? i1)(n2 ? i2) 1/2 so we can treat that part as a constant. Consider (m1 + m2)m2(n1 ? m1)(n2 ? m2) 1/2 , i 1i1 = (m1 + z1?n)m1+z1?n = m1m1+z1?n 1 + Then from the Taylor expansion of the logarithm function, we have log 1 + = (m1 + z1?n) = (m1 + z1?n) log 1 + = z1?n + 2zm12n1 + o( 1 ). m1 Therefore, Similar expansions are valid for i2, n1 ? i1, n2 ? i2 and i1 + i2: z2n log(i 2i2 ) ? m2 log(m2) + z2?n(log(m2) + 1) + 2m2 2 . log((ni ? i1)n1?i1 ) ? (n1 ? m1) log(n1 ? m1) log((n2 ? i2)n2?i2 ) ? (n2 ? m2) log(n2 ? m2) ? z1?n(log(n1 ? m1) + 1) + 2(n1z12?nm1) . ? z2?n(log(n2 ? m2) + 1) + 2(n2z22?nm2) . log((i1 + i2)i1+i2 ) ? (m1 + m2) log(m1 + m2) We now use the calculations in Sect. 3 to evaluate all the ?n coefficients. By (4) we have Therefore, we have , m1 where B4 = log B3 (m1+m2)m2(n1?m1)(n2?m2) 1 (? ? f1) f1((1 ? ? ) f1 + f22) 2 , ? f22 + (1 ? ? ) f12 1 (1 ? ? ? f2) f2(? f2 + f12) 2 . ? f22 + (1 ? ? ) f12 1 2 , P(I1 = i1, I2 = i2|K = 0) 1 ? exp(B4) exp ? 2(1 ? ?2) z12n z22n ?12 + ?22 ? 2?z1z2n ?1?2 . Therefore, ( I1??nm1 , I2??nm2 ) given K = 0 converges in distribution as n ? ? to the bivariate normal distribution as stated in (10). When K = k > 0, and n ? ?, similarly, We can now use the same approximation as for k = 0 to show that I1??nm1, I2??m2 n converges to the same bivariate normal distribution. C Appendix: Proofs for Sect. 
4.3 Proof of Theorem 6 From (4), f2 m2 T?(1 ? ?) f1 + f2 = m1 + m2 = T?? + T?(1 ? ?) = 1 ? ?. Take a fixed arbitrary ? (0,min{ f1, f2}). Fix k > 0. For any i1,i2 satisfying |i1/n ? f1| < , |i2/n ? f2| < and i1 ? 1, from (6), noting ab++cc ? ab for any 0 < a ? b and c > 0, we have Therefore, For fixed k0 > 0, 1?? k0 1 + f2 + f12n k0 P (K ? k0, I1 = i1, I2 = i2) < ?(0,i1,i2) ? 1 ? 1??? 1 + f2 + f12n . Notetheaboveinequalityisvalidforanyi1,i2 satisfying|i1/n? f1| < ,|i2/n? f2| < . We have P (K ? k0,|I1/n ? f1| < ,|I2 ? f2| < ) < P (K = 0,|I1/n ? f1| < ,|I2 ? f2| < ) 1 ? 1??? 1 + f2 + f12n . 1?? k0 1 + f2 + f12n k0 ? From Theorem 4, there exists an N1 such that, when n > N1, P(|I1/n ? f1| < ,|I2/n ? f2| < ) > 1 ? . Then we have, P(K ? k0) < P (K ? k0||I1/n ? f1| ? ,|I2/n ? f2| ? ) ? P (|I1/n ? f1| < ,|I2/n ? f2| < ) + P (|I1/n ? f1| ? ,|I2/n ? f2| ? ) < P (K = 0||I1/n ? f1| ? ,|I2/n ? f2| ? ) 1?? k0 1 + f2 + f12n k0 ? This upper bound can be arbitrarily close to 0 when choosing , n > N1, and k0. Therefore, we have shown the tightness of K; that is, Using ? lim P (K = k) = 1. k=0 n?? P(K = k) = P (K = k,|I1/n ? f1| < ,|I2/n ? f2| < ) +P (K = k,|I1/n ? f1| ? ,|I2/n ? f2| ? ), (13) for fixed k > 0, when n > N1, the ratio PP(K(K==k?k)1) is lower bounded by P (K = k,|I1/n ? f1| < ,|I2/n ? f2| < ) + . P (K = k ? 1,|I1/n ? f1| < ,|I2/n ? f2| < ) For any i1,i2 satisfying |i1/n ? f1| < , |i2/n ? f2| < and i1 ? 1, in addition to (12), we have the lower bound ( f(2f1?+)fn2)?nk?+k1 ?1, ( f(2f1++)fn2)+n1 ?1 . Now we have Therefore, that is, |i1/n?f1|< ,|i2/n?f2|< ?(k,i1,i2) |i1/n?f1|< ,|i2/n?f2|< ?(k ? 1,i1,i2) ? ( f(2f?1+)nf2?)nk?+k 1 ?1, ((f2f1++ )fn2)+n1 ?1 , P (K = k,|I1/n ? f1| < ,|I2/n ? f2| < ) P (K = k ? 1,|I1/n ? f1| < ,|I2/n ? f2| < ) ( f2 ? )n ? k + 1 1 ( f2 + )n + 1 1 ? ( f1 + f2)n ? k ?, ( f1 + f2)n ? . (14) For fixed k, as n ? ?, the lower bound and the upper bound in (14) both converge to 1???. Noting that can be arbitrarily close to 0, we have nl?im? P(K = k ? 
1) = 1 ?? ?. P(K = k) This, together with the tightness (13), proves (8). Proof of Theorem 7 When ?n21 ? ?11 > n?21 ? ?12, the unscaled K converges to a geometric distribution. As we saw in Theorems 4 and 5, as n ? ?, the distribution of the scaled deviations of I1, I2 conditional on the value of K = k converges to a normal distribution, with mean and variance that do not depend on k. We can now use the law of total probability and find N0 large enough so that the unconditional probability distribution of the scaled I1, I2 is close to the specified normal distribution when n > N0. One more step then shows that, as n ? ?, the conditional distribution given K is the same, so we have the asymptotic independence. D Appendix: Proofs for Sect. 4.4 Proof of Proposition 1 Let A1(t ) be the arrival stream of customers that are served eventually by servers of type s1, and let I1(t ) be, as defined above, the number of idle servers of type s1. We now compare this to an M /M /n1 system, with type s1 servers, whose processing times are exponential with rate ?1, and with arrival stream A?1(t ) which consists of all the arrivals of the stream A1(t ) which are customers of type c2, but excludes arrivals of type c1. Clearly, A1(t ) ? A?1(t ) a.s. Denote by I?1(t ) the number of idle servers in the M /M /n1 system at time t . It then follows directly from Theorem 1 of Shanthikumar and Yao [18] that the stationary distributions of I1 and I?1 satisfy I?1 ?ST I1. Define similarly an M /M /n2 system with type s2 servers, whose processing times are exponential with rate ?2 and arrivals A?2(t ) of all the customers of type c1. Then A2(t ) ? A?2(t ) a.s. and, by the same argument, I?2 ?ST I2. As n becomes large, the numbers of idle servers in the two independent M /M /N systems (I?1(?), I?2(?)) can be approximated by normal distributions with means ??12 , respectively. Since, in Case I, c = ?2 ? ?1 we have = ?(1 ? ?)?c = O(n), while the standard ?1 M = (1 ? ?) n2 ? ?2 deviations are ?2 + ? 
n1 ? ?1 o ?1n and P((1 ? ?)I?2 ? M ) = o ?1n . Therefore, O(?n). Define the middle point 2. As n ? ?, we have P(? I?1 ? M ) = P(? I1 ? (1 ? ?)I2) ? P(? I1 ? M ) + P((1 ? ?)I2 ? M ) ? P(? I?1 ? M ) 1 + P((1 ? ?)I?2 ? M ) = o ?n . Proof of Theorem 8 Given i1 ? (0,?),i2 ? (0,1 ? ?), and i2 > 1???i1, for 0 ? k < i2, P(K = kn|I1 = i1n, I2 = i2n) n1 n2 i1n(i2n)!(i1n + i2n ? kn ? 1)!?i11n?i22n??i1n?i2n??kn = B2 i1n i2n (i2n ? kn)! ? B3 ((i1 + i2 ? k)(i2 ? k))?1/2 ((i1 + i2 ? k)n)(i1+i2?k)n ??kn, ((i2 ? k)n)(i2?k)n where B2 = B1/P(I1 = i1n, I2 = i2n), B3 = B2 in11n in22n i1(i2n)! ??1 i1n ?2 i2n ? exp(?i1n). Choose k to maximize ((i1 + i2 ? k)n)log((i1 + i2 ? k)n) ? ((i2 ? k)n)log((i2 ? k)n) ? kn log?. ?K = (1 ??i1?)2. 2 +2(i1 + i2 ? k?) ((i1 + i2)n ? (k?n + x?n))log((i1 + i2)n ? (k?n + x?n)) = ((i1 + i2 ? k?)n)log((i1 + i2 ? k?)n) ? x?n(log((i1 + i2 ? k?)n) + 1) x2 ,(i2n ? (k?n + x?n))log(i2n ? (k?n + x?n)) x2 = ((i2 ? k?)n)log((i2 ? k?)n) ? x?n(log((i2 ? k?)n) + 1) + 2(i2 ? k?). log P(K = k?n + x?n|I1 = i1n, I2 = i2n) i1x2 P(K = k?n|I1 = i1n, I2 = i2n) ? ?2(i1 + i2 ? k?)(i2 ? k?) x2 = ?2?i1/(1 ? ?)2. Therefore, K??k?n I1 = i1n, I2 = i2n isanormaldistributionwithmean0andvariance n ?log((i1 + i2 ? k)n) + log((i2 ? k)n) ? log? = 0. k = k? = i2 ? 1 ?? ?i1. The first-order condition is Therefore, the optimal value is Given K = k?n + x?n, Therefore, Proof of Theorem 9 From Theorem 8, we know that, as n ? ?, the percentage of type c1 customers served by type s1 servers goes to 0 faster than O( ?1n ). Therefore, as n ? ?, the two server pools are decoupled in the sense that type s1 servers serving type c1 customers do not affect the fluid and diffusion limits of the decoupled systems. From the proof of Proposition 1, we know I1 ? (n1 ? ??21 ) /?n converges to a normal distribution with mean 0 and variance n??21 ; independently, I2 ? (n2 ? ??12 ) /?n converges to a normal distribution with mean 0 and variance n??12 . E Appendix: Proofs for Sect. 
4.5 Proof of Proposition 2 We want to show PP((KK ((??21))==kk)) is decreasing in k. Note that n1 n2 i1=1 i2=k k i2=1 P(K (?) = k) = ?(k, i1, i2) + ?(k, 0, i2) + ?(k, 0, 0). From Theorem 2, given k, for any i1 ? {1, . . . , n1}, i2 ? {k, . . . , n2}, ??2 (k, i1, i2) B1(?2) ??1 (k, i1, i2) = B1(?1) ?1 k , which is decreasing in k; for any i2 = {1, . . . , k}, ??2 (k, 0, i2) ??1 (k, 0, i2) = B1(?1) j=n?k ?1n1 + ?2( j ? n1) ? (1 ? ?2)? B1(?2) n?i2 ?1n1 + ?2( j ? n1) ? (1 ? ?1)? ?1 i2 , which is decreasing in k; n?1 ??2 (k, 0, 0) B1(?2) ?1n1 + ?2( j ? n1) ? (1 ? ?1)? , ??1 (k, 0, 0) = B1(?1) j=n?k ?1n1 + ?2( j ? n1) ? (1 ? ?2)? which is decreasing in k. Therefore, PP((KK ((??21))==kk)) is decreasing in k, that is, P(K (?1) = k + 1) P(K (?2) = k + 1) P(K (?1) = k) P(K (?2) = k) , meaning K (?1) is larger than K (?2) in the likelihood ratio order, implying K (?1) stochastically dominates K (?2). Proof of Corollary 1 The condition n?21 ? ?2 = ?n21 ? ?11 is equivalent to ? = 1 ? ?. 1 By Proposition 2 K?1 ?ST K1?? ?ST K?2 whenever ?1 < 1 ? ? < ?2. But, for all 1 ? ? < ?2, K?2 /n ? 0, and for ?1 < 1 ? ?, lim?1?1?? limn?? K?1 /n = 0, and the corollary follows. Proof of Theorem 10 We prove this theorem in two steps: ? Prove that fluid limits are limn?? I1/n = f1,k, limn?? I2/n = f2,k. ? Prove the central limit behavior. When i1 ? [m1 ? n,m1 + n] and i2 ? [m2 ? n,m2 + n], and n grows large, we can use Stirling?s approximation: (i1 + i2 ? kn)i1+i2?kn exp(?i1 ? i2) (n1 ? i1)n1?i1i1i1(n2 ? i2)n2?i2(i2 ? kn)i2?kn ? ?1 i1 ?2 i2 1/2 ? i1 = B3,k (i1 + i2 ? kn)(n1 ? i1)(n2 ? i2)(i2 ? kn) exp (i1 + i2 ? kn)log(i1 + i2 ? kn) ? (n1 ? i1)log(n1 ? i1) ?i1 log(i1) ? (n2 ? i2)log(n2 ? i2) ? (i2 ? kn)log(i2 ? kn) +i1 log ??1 + i2 log ??2 ? i1 ? i2 = B4,k exp n (x1 + x2 ? k)log(x1 + x2 ? k) ? (? ? x1)log(? ? x1) ? x1 log(x1) ? (1 ? ? ? x2)log(1 ? ? ? x2) ?(x2 ? k)log(x2 ? k) + x1 log ?1 r + x2 log ?r2 ? x1 ? 
x2 , where B2,k = B1/P(K = kn), B3,k 1/=2 B2,kn1!n2!(2?)?23en??kn, B4,k = B3n?n?3/2 (x1+x2?k)(??x1x)1(1???x2)(x2?k) , x1 = in1, x2 = in2. We define F(x1, x2) = (x1 + x2 ? k)log(x1 + x2 ? k) ? (? ? x1)log(? ? x1) ?(1 ? ? ? x2)log(1 ? ? ? x2) + x1(log?1 ? logr ? log x1) +x2(log?2 ? logr ? log x2) ? x1 ? x2. The first-order derivatives on x1 and x2 are ?F ?x1 = log(x1 + x2 ? k) + log(? ? x1) ? log(x1) ? log ?r1 = 0, ?F ?x2 = log(x1 + x2 ? k) + log(1 ? ? ? x2) ? log(x2 ? k) ? log ?r2 = 0. We can solve Look at the second-order derivatives: x1 = f1,k , x2 = f2,k . ?2 F ?2 F ? x1? x2 = x1 + x2 ? k 2 , 1 1 ?2 F 1 1 1 ? x12 = ? ? ? x1 ? x1 + x1 + x2 ? k < 0, 1 1 ? x22 = ? 1 ? ? ? x2 ? x2 ? k + x1 + x2 ? k < 0, ?2 F The Hessian matrix is negative definite. Therefore, F (x1, x2) is strictly concave on (0, ? ) ? (0, 1 ? ? ) and reaches its unique global maximum at ( f1,k , f2,k ). Similar to the proof of Theorem 4, we can show that lim I1 n?? n ? f1,k , lim I2 n?? n ? f2,k . To obtain the asymptotic distribution of I1, I2 as n ? ?, we only need to consider i1, i2 for which i1/n ? f1,k and i2/n ? f2,k . Similar to the proof of Theorem 5, we write i1 = f1,k n + z1?n, i2 = f2,k n + z2?n, with z1/?n ? 0, z2/?n ? 0. P(I1 = i1, I2 = i2|K = kn) ? i1 B3,k (i1 + i2 ? kn)(n1 ? i1)(n2 ? i2)(i2 ? kn) 1/2 exp (i1 + i2 ? kn) log(i1 + i2 ? kn) ? (n1 ? i1) log(n1 ? i1) ? i1 log(i1) ? (n2 ? i2) log(n2 ? i2) ? (i2 ? kn) log(i2 ? kn) ?1 ?2 + i1 log ? + i2 log ? ? i1 ? i2 . From the definitions of f1,k and f2,k in Theorem 10, (? ? f1,k )?1 ? f1,k (1 ? ? ? f2,k )?2 ?( f2,k ? k) 1 = n( f1,k + f2,k ? k) where (z1 + z2)2 log(P(I1 = i1, I2 = i2|K = kn)) ? B5,k + 2( f1,k + f2,k ? k) f1,k where B5,k = log B3,k (f1,k+f2,k?k)(f2,k?k)(??f1,k)(1???f2,k) n(? log(? ? f1,k) + (1 ? ?)log(1 ? ? ? f2,k) + f1,k + f2,k) + kn(log( f2,k ? k) ? log( f1,k + f2,k ? k)). Therefore, organizing the formula, we have 1/2 ? 25n logn ? 2 ?k?1,k?2,k , I1 ? f1,kn, I2 ? f2,kn K = kn ? N 0, ??1k,?k1,k?2,k ?2,k ?n ?n 2 f12,k(1 ? ? 
? k) + ( f2,k ? k)2? , Proof of Theorem 11 The density of the highest point of the approximating binormal distribution in Theorem 10 is 1 2?(1 ? ?k2)?1,k?2,k ( f2,k ? k)2? + f12,k(1 ? ? ? k) . = 2? f1,k( f2,k ? k)(? ? f1,k)(1 ? ? ? f2,k)( f1,k + f2,k ? k) For a large n, letting x be the ceiling of a real number x, and from Theorem 10, P(I1 = f1,kn , I2 = f2,kn |K = kn) 1 ( f2,k ? k)2? + f12,k(1 ? ? ? k) . ? n 2? f1,k( f2,k ? k)(? ? f1,k)(1 ? ? ? f2,k)( f1,k + f2,k ? k) Combined with (15), we have ( f2,k ? k)2? + f12,k(1 ? ? ? k) B5,k ? ?logn + 21 log 2? f1,k( f2,k ? k)(? ? f1,k)(1 ? ? ? f2,k)( f1,k + f2,k ? k) . Recall that 1/2 ? 25n logn f1,k + 21 log ( f1,k + f2,k ? k)( f2,k ? k)(? ? f1,k)(1 ? ? ? f2,k) 5n logn ? n(? log(? ? f1,k) + (1 ? ?)log(1 ? ? ? f2,k) + f1,k + f2,k) +n ? 2 +kn(log( f2,k ? k) ? log( f1,k + f2,k ? k)) ? kn log?. Therefore, where Define log P(K = kn) ? B6 + log( f1,k) ? 21 log ( f2,k ? k)2? + f12,k(1 ? ? ? k) ?n ? log(? ? f1,k) + (1 ? ?)log(1 ? ? ? f2,k) + f1,k + f2,k ? k log( f2,k ? k) +k log( f1,k + f2,k ? k) + k log? , B6 = log B1n1!n2!(2?)?1 + n ? 25n logn ? 23 logn. G(k) = ? log(? ? f1,k) + (1 ? ?)log(1 ? ? ? f2,k) + f1,k + f2,k ?k log( f2,k ? k) + k log( f1,k + f2,k ? k) + k log?. From Theorem 10, we can denote k by T, k = (?1 ? ?2)? ? (1 + ?1T)(r ? ?2 + r?2T). ?2(1 + ?1T) Note that T is nonnegative and no larger than the value in (3), denoted by T. Note also that f1,k = T+1/?1, f2,k = TT(1+?1?/??2k) + k. Algebra gives T? dG r?2(1 + ?1T)2 + ?1(?1 ? ?2)? log r ? 1+??1?1T ? log(?r) dT = ?2(1 + ?1T)2 Solving ddGT = 0 gives If then ? 1 T = (1 ? ?)r ? ?1 n1 1 ?2 ? ?1 , T < n1 1 ? 1 ?2 ? ?1 = (1 ? ?)r ? ?1 n1 1 ? 1 T ? ?2 ? ?1 = (1 ? ?)r ? ?1 k = ?r lim P n?? K n ? k = 0. , , and G(T ) is minimized at T = T ; otherwise, and G(T ) is minimized at T = (1???)r ? ?11 . When G(T ) is minimized at T = T , the corresponding k = 0, we go back to the pooled system case; when G(T ) is minimized at T = (1???)r ? 
?11 , the corresponding By now we have shown that, for any > 0, Therefore, lim P n?? I1 n ? f1 > f1 = ? ? = 0, lim P n?? (1 ? ?)r = 0, This is consistent with our intuitive calculation in Sect. 3. Suppose xT changexs2 from T = (1???)r ? ?11 to T +xx /?n; thx2en f1,k changes ? f1 = f1,k ?n + f1,k 2n + o n1 , f2,k changes ? f2 = f2,k ?n + f2,k 2n + o n1 , and k 2 changes ?k = k ?xn + k 2xn + o n1 , nG(T) = n1 log(? ? f1,k) + n2 log(1 ? ? ? f2,k) + f1,kn + f2,kn ? kn log( f2,k ? k) +kn log( f1,k + f2,k ? k) + kn log? ?f1 (?f1)2 = n1 log(? ? f1 ) ? ? ? f1 ? 2(? ? f1 )2 ?f2 (?f2)2 +n2 log(1 ? ? ? f2 ) ? (1 ? ? ? f2 ) ? 2(1 ? ? ? f2 )2 ? ? f1,k ? (11?????)f2f,2k + f1,k + f2,k + k log? ? k log f2 ? k ? ? f1 ?( f2f,2k ??kk )k + k log f1 + f2 ? k + f1 + f2 ? k f1,k + f2,k ? k k , which equals 0. The O( 1 ) term is 2 f1,k ? f1,k + f2,k ? k k + f1 + f2 ? k 2 f2,k ? k k k f2,k ? k f2 ? k + 2 f2 ? k 2 ? 2 f2 ? k 2 f1,k + f2,k ? k k k f1,k + f2,k ? k ? ? x2. ? 2 f1 + f2 ? k 2 + 2 f1 + f2 ? k ? The above equals (1 ? ?)2r2((1 ? ?)2r(?1 ? ?2) + ?1?2?)x2. 2??1?2?2 Noting that the change from k to k + y?n gives Therefore, we have y = k T =T x . k T =T = ? P(K = k n + y?n) P(K = k ) ?1?2? ? ?((1 ? ?)2r (?1 ? ?2) + ?1?2? ) y2(1 ? ?)2?1?2 . = 2?((1 ? ?)2r (?1 ? ?2) + ?1?2? ) y (1 ? ?)2r 2((1 ? ?)2r (?1 ? ?2) + ?1?2? ) 2??1?2? 2 2 Therefore, as n ? ?, the variance of K ??k n converges to n 2 ?K = 0, ?K2 . n?21 ? ?12 = ?n21 ? ?11 , k = 0, the above calculation is valid only for x < 0, and ?Kn converges to a truncated normal distribution. The density function is f ?Kn (k) = 2 Note that P(K = 0) ? n 2?K2 ? If k2 exp ? 2?K2 ? 0 as n ? ?. , ?k ? 0. n?21 ? ?12 < ?n21 ? ?11 , k = 0, the above calculation is no longer valid because the coefficient of the x ?n term is nonzero. From Theorem 6 we know that K converges to a geometric distribution when n ? ?. References In summary, when ? ? 1, P (I1 = 0, I2 = 0) is negligible compared with P (I1 = 0, I2 > 0) when n ? ? . 
We have m1 + m2 m1+m2 ?1 m1 ?2 m2 B1 2?m1n1n2 1/2 = n5?1?2 (m1 + m2)(n1 ? m1)(n2 ? m2)m2 n1 n1 n2 n2 m1 + m2 m1 m1 + m2 m2 ? n1 ? m1 n2 ? m2 m1 m2 ?1(n1 ? m1) m1 ?2(n2 ? m2) m2 ? exp(?m1 ? m2) ? ? 1 . Therefore, 1 ??

1. Adan, I.J.B.F., Weiss, G.: Exact FCFS matching rates for two infinite multi-type sequences. Oper. Res. 60(2), 475–489 (2012)
2. Adan, I.J.B.F., Weiss, G.: A queue with skill based service under FCFS-ALIS: steady state, overloaded system, and behavior under abandonments. Stoch. Syst. 4(1), 250–299 (2014)
3. Adan, I., Foley, R., McDonald, D.: Exact asymptotics of the stationary distribution of a Markov chain: a production model. Queueing Syst. 62(4), 311–344 (2009)
4. Adan, I., Boon, M., Weiss, G.: A design heuristic for skill based parallel service systems. arXiv preprint arXiv:1603.01404 (2016)
5. Adan, I., Busic, A., Mairesse, J., Weiss, G.: Reversibility and further properties of FCFS infinite bipartite matching. arXiv preprint arXiv:1507.05939 (2015)
6. Armony, M., Ward, A.R.: Fair dynamic routing in large-scale heterogeneous-server systems. Oper. Res. 58(3), 624–637 (2010)
7. Bell, S.L., Williams, R.J.: Dynamic scheduling of a system with two parallel servers in heavy traffic with resource pooling: asymptotic optimality of a threshold policy. Ann. Appl. Probab. 11(3), 608–649 (2001)
8. Caldentey, R., Kaplan, E.H., Weiss, G.: FCFS infinite bipartite matching of servers and customers. Adv. Appl. Probab. 41(3), 695–730 (2009)
9. Foss, S., Chernova, N.: On the stability of a partially accessible multi-station queue with state-dependent routing. Queueing Syst. 29(1), 55–73 (1998)
10. Ghamami, S., Ward, A.R.: Dynamic scheduling of a two-server parallel server system with complete Oper. Res. 38(4), 761–824 (2013)
11. Green, L.: A queueing system with general-use and limited-use servers. Oper. Res. 33(1), 162–182 (1985)
12. Gurvich, I., Whitt, W.: Queue-and-idleness-ratio controls in many-server service systems. Math. Oper. Res. 34(2), 363–396 (2009)
13. Gurvich, I., Whitt, W.: Service-level differentiation in many-server service system via queue-ratio routing. Oper. Res. 58(2), 316–328 (2010)
14. Harchol-Balter, M., Crovella, M.E., Murta, C.D.: On choosing a task assignment policy for a distributed server system. J. Parallel Distrib. Comput. 59(2), 204–228 (1999)
15. Harrison, J.M., Lopez, M.J.: Heavy traffic resource pooling in parallel-server systems. Queueing Syst. 33(4), 339–368 (1999)
16. Nov, Y., Weiss, G., Zhang, H.: Fluid models of parallel service systems under FCFS. arXiv preprint arXiv:1604.04497 (2016)
17. Rubino, M., Ata, B.: Dynamic control of a make-to-order, parallel-server system with cancellations. Oper. Res. 57(1), 94–108 (2009)
18. Shanthikumar, J.G., Yao, D.D.: Comparing ordered-entry queues with heterogeneous servers. Queueing Syst. 2(3), 235–244 (1987)
19. Tezcan, T., Dai, J.G.: Dynamic control of N-systems with many servers: asymptotic optimality of a static priority policy in heavy traffic. Oper. Res. 58(1), 94–110 (2010)
20. Tezcan, T.: Stability analysis of N-model systems under a static priority rule. Queueing Syst. 73(3), 235–259 (2013)
21. Visschers, J., Adan, I.J.B.F., Weiss, G.: A product form solution to a system with multi-type customers and multi-type servers. Queueing Syst. 70(3), 269–298 (2012)
22. Ward, A.R., Armony, M.: Blind fair routing in large-scale service systems with heterogeneous customers and servers. Oper. Res. 61(1), 228–243 (2013)
23. Williams, R.J.: On dynamic scheduling of a parallel server system with complete resource pooling. Fields Inst. Commun. 28, 49–71 (2000)



Dongyuan Zhan, Gideon Weiss. Many-server scaling of the N-system under FCFS–ALIS, Queueing Systems, 2017, 27-71, DOI: 10.1007/s11134-017-9549-7