The java.net.SocketException problem
2008-03-28 10:46
The "java.net.BindException: Address already in use: connect" problem
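The likely cause is that a large number of new Socket operations happen within a short time, while socket.close() does not release the bound port immediately. Instead the port is put into the TIME_WAIT state and is released only after a while (240 s by default); you can see this with netstat -na. Eventually system resources run out. On Windows, what runs out is the pool of ephemeral ports, which lies in the range 1024-5000.

There are two ways to avoid the problem.

One is to raise the maximum number of connection threads on your web server; 1024 or 2048 will do. Taking Resin as an example, change thread-pool.thread_max in resin.conf. If you run Apache in front of Resin, remember to adjust Apache as well.

The other is to change the operating system network configuration of the machine that runs the web server so that sockets spend less time in TIME_WAIT, for example 30 s. On Red Hat, look at the related options (they let the kernel reuse or quickly recycle TIME_WAIT sockets rather than shorten the timer itself):

    [xxx@xxx~]$ /sbin/sysctl -a | grep net.ipv4.tcp_tw
    net.ipv4.tcp_tw_reuse = 0
    net.ipv4.tcp_tw_recycle = 0

Edit the configuration file:

    [xxx@xxx~]$ vi /etc/sysctl.conf

and change the values to:

    net.ipv4.tcp_tw_reuse = 1
    net.ipv4.tcp_tw_recycle = 1

Then make the kernel parameters take effect:

    [xxx@xxx~]$ sysctl -p

Note that tcp_tw_recycle is known to break connections from clients behind NAT and was removed in Linux 4.12, so on newer kernels only tcp_tw_reuse is available.

The socket FAQ has a section on TIME_WAIT, excerpted below:

2.7. Please explain the TIME_WAIT state.

Remember that TCP guarantees all data transmitted will be delivered, if at all possible. When you close a socket, the server goes into a TIME_WAIT state, just to be really really sure that all the data has gone through. When a socket is closed, both sides agree by sending messages to each other that they will send no more data. This, it seemed to me was good enough, and after the handshaking is done, the socket should be closed. The problem is two-fold. First, there is no way to be sure that the last ack was communicated successfully. Second, there may be "wandering duplicates" left on the net that must be dealt with if they are delivered.

Andrew Gierth (andrew@erlenstar.demon.co.uk) helped to explain the closing sequence in the following usenet posting:

Assume that a connection is in ESTABLISHED state, and the client is about to do an orderly release. The client's sequence no. is Sc, and the server's is Ss.

    Client                                              Server
    ======                                              ======
    ESTABLISHED                                         ESTABLISHED
    (client closes)
    ESTABLISHED                                         ESTABLISHED
             <CTL=FIN+ACK><SEQ=Sc><ACK=Ss> ------->>
    FIN_WAIT_1
             <<-------- <CTL=ACK><SEQ=Ss><ACK=Sc+1>
    FIN_WAIT_2                                          CLOSE_WAIT
             <<-------- <CTL=FIN+ACK><SEQ=Ss><ACK=Sc+1> (server closes)
                                                        LAST_ACK
             <CTL=ACK>,<SEQ=Sc+1><ACK=Ss+1> ------->>
    TIME_WAIT                                           CLOSED
    (2*msl elapses...)
    CLOSED

Note: the +1 on the sequence numbers is because the FIN counts as one byte of data. (The above diagram is equivalent to fig. 13 from RFC 793).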
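To make the sequence-number bookkeeping in the closing sequence concrete, here is a minimal Java sketch that replays the four segments of an orderly close. The class name CloseSequenceSketch, the trace method and the sample values 100 and 300 are made up for illustration. It only formats the transitions described in the FAQ, does not touch real sockets, and ignores 32-bit sequence wraparound.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: replays the orderly-close handshake on paper.
final class CloseSequenceSketch {
    enum State { ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, CLOSE_WAIT, LAST_ACK, TIME_WAIT, CLOSED }

    // Formats one segment plus the state of each end once that segment
    // has been delivered and processed.
    private static String step(String sender, String ctl, long seq, long ack,
                               State client, State server) {
        return sender + " sends <CTL=" + ctl + "><SEQ=" + seq + "><ACK=" + ack + ">"
                + " -> client=" + client + ", server=" + server;
    }

    // sc and ss are the client's and server's sequence numbers (Sc and Ss in the FAQ).
    static List<String> trace(long sc, long ss) {
        List<String> log = new ArrayList<>();
        // 1. Client closes first and sends its FIN.
        log.add(step("client", "FIN+ACK", sc, ss, State.FIN_WAIT_1, State.CLOSE_WAIT));
        // 2. Server acknowledges; the FIN counts as one byte, hence ACK=Sc+1.
        log.add(step("server", "ACK", ss, sc + 1, State.FIN_WAIT_2, State.CLOSE_WAIT));
        // 3. Server closes and sends its own FIN; the client moves to TIME_WAIT.
        log.add(step("server", "FIN+ACK", ss, sc + 1, State.TIME_WAIT, State.LAST_ACK));
        // 4. Client sends the final ACK and stays in TIME_WAIT for 2*MSL.
        log.add(step("client", "ACK", sc + 1, ss + 1, State.TIME_WAIT, State.CLOSED));
        return log;
    }

    public static void main(String[] args) {
        for (String line : trace(100L, 300L)) {
            System.out.println(line);
        }
    }
}
```

The point to notice is that the end which closes first (the client here) is the one left in TIME_WAIT, which is why a client that opens and closes many sockets quickly is the side that runs out of ports.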
Now consider what happens if the last of those packets is dropped in the network. The client has done with the connection; it has no more data or control info to send, and never will have. But the server does not know whether the client received all the data correctly; that's what the last ACK segment is for. Now the server may or may not care whether the client got the data, but that is not an issue for TCP; TCP is a reliable protocol, and must distinguish between an orderly connection close where all data is transferred, and a connection abort where data may or may not have been lost.

So, if that last packet is dropped, the server will retransmit it (it is, after all, an unacknowledged segment) and will expect to see a suitable ACK segment in reply. If the client went straight to CLOSED, the only possible response to that retransmit would be a RST, which would indicate to the server that data had been lost, when in fact it had not been. (Bear in mind that the server's FIN segment may, additionally, contain data.)

DISCLAIMER: This is my interpretation of the RFCs (I have read all the TCP-related ones I could find), but I have not attempted to examine implementation source code or trace actual connections in order to verify it. I am satisfied that the logic is correct, though.

More commentary from Vic:

The second issue was addressed by Richard Stevens (rstevens@noao.edu, author of "Unix Network Programming", see ``1.5 Where can I get source code for the book [book title]?''). I have put together quotes from some of his postings and email which explain this. I have brought together paragraphs from different postings, and have made as few changes as possible.

From Richard Stevens (rstevens@noao.edu):

If the duration of the TIME_WAIT state were just to handle TCP's full-duplex close, then the time would be much smaller, and it would be some function of the current RTO (retransmission timeout), not the MSL (the packet lifetime).
A couple of points about the TIME_WAIT state.

o The end that sends the first FIN goes into the TIME_WAIT state, because that is the end that sends the final ACK. If the other end's FIN is lost, or if the final ACK is lost, having the end that sends the first FIN maintain state about the connection guarantees that it has enough information to retransmit the final ACK.

o Realize that TCP sequence numbers wrap around after 2**32 bytes have been transferred. Assume a connection between A.1500 (host A, port 1500) and B.2000. During the connection one segment is lost and retransmitted. But the segment is not really lost, it is held by some intermediate router and then re-injected into the network. (This is called a "wandering duplicate".) But in the time between the packet being lost & retransmitted, and then reappearing, the connection is closed (without any problems) and then another connection is established between the same host, same port (that is, A.1500 and B.2000; this is called another "incarnation" of the connection). But the sequence numbers chosen for the new incarnation just happen to overlap with the sequence number of the wandering duplicate that is about to reappear. (This is indeed possible, given the way sequence numbers are chosen for TCP connections.) Bingo, you are about to deliver the data from the wandering duplicate (the previous incarnation of the connection) to the new incarnation of the connection. To avoid this, you do not allow the same incarnation of the connection to be reestablished until the TIME_WAIT state terminates. Even the TIME_WAIT state doesn't completely solve the second problem, given what is called TIME_WAIT assassination. RFC 1337 has more details.

o The reason that the duration of the TIME_WAIT state is 2*MSL is that the maximum amount of time a packet can wander around a network is assumed to be MSL seconds. The factor of 2 is for the round-trip.
The recommended value for MSL is 120 seconds, but Berkeley-derived implementations normally use 30 seconds instead. This means a TIME_WAIT delay between 1 and 4 minutes. Solaris 2.x does indeed use the recommended MSL of 120 seconds.

A wandering duplicate is a packet that appeared to be lost and was retransmitted. But it wasn't really lost ... some router had problems, held on to the packet for a while (order of seconds, could be a minute if the TTL is large enough) and then re-injects the packet back into the network. But by the time it reappears, the application that sent it originally has already retransmitted the data contained in that packet.

Because of these potential problems with TIME_WAIT assassinations, one should not avoid the TIME_WAIT state by setting the SO_LINGER option to send an RST instead of the normal TCP connection termination (FIN/ACK/FIN/ACK). The TIME_WAIT state is there for a reason; it's your friend and it's there to help you :-)

I have a long discussion of just this topic in my just-released "TCP/IP Illustrated, Volume 3". The TIME_WAIT state is indeed, one of the most misunderstood features of TCP.

I'm currently rewriting "Unix Network Programming" (see ``1.5 Where can I get source code for the book [book title]?'') and will include lots more on this topic, as it is often confusing and misunderstood.

An additional note from Andrew:

Closing a socket: if SO_LINGER has not been called on a socket, then close() is not supposed to discard data. This is true on SVR4.2 (and, apparently, on all non-SVR4 systems) but apparently not on SVR4; the ...
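As a rough sanity check on the numbers in this post (an ephemeral range of 1024-5000 on Windows, a 240 s TIME_WAIT, and 2*MSL with an MSL of 30 or 120 seconds), the sketch below estimates how many new outbound connections per second a client can sustain before it runs out of ports. EphemeralPortBudget and its method names are hypothetical, and the model assumes every closed connection holds its local port for the full TIME_WAIT period and that no port is reused early, so treat the result as an estimate only.

```java
// Hypothetical back-of-the-envelope model; it does not query the operating system.
final class EphemeralPortBudget {

    // TIME_WAIT lasts 2*MSL, as explained in the FAQ excerpt.
    static int timeWaitSeconds(int mslSeconds) {
        return 2 * mslSeconds;
    }

    // Number of ports in an inclusive range such as 1024-5000.
    static int portCount(int firstPort, int lastPort) {
        return lastPort - firstPort + 1;
    }

    // Sustained rate of new connections per second at which ports are freed
    // exactly as fast as they are consumed. Above this rate the pool drains.
    static double maxConnectionsPerSecond(int firstPort, int lastPort, int timeWaitSeconds) {
        return (double) portCount(firstPort, lastPort) / timeWaitSeconds;
    }

    public static void main(String[] args) {
        int[] waits = { 240, 30 }; // default from the post, and the suggested lower value
        for (int wait : waits) {
            double rate = maxConnectionsPerSecond(1024, 5000, wait);
            System.out.println("TIME_WAIT=" + wait + "s -> about "
                    + String.format("%.1f", rate) + " new connections/s");
        }
    }
}
```

With the figures from the post this works out to roughly 16.6 new connections per second at 240 s and roughly 132.6 per second at 30 s, which is why a burst of short-lived sockets can exhaust the pool so quickly.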