IT박스

Java 가비지 콜렉션은 순환 참조와 어떻게 작동합니까?

itboxs 2020. 6. 15. 22:00
반응형

Java 가비지 콜렉션은 순환 참조와 어떻게 작동합니까?


내 이해에서 Java의 가비지 수집은 해당 객체를 가리키는 다른 것이 없으면 일부 객체를 정리합니다.

내 질문은, 우리가 이와 같은 것을 가지고 있다면 어떻게 될까요?

class Node {
    public object value;
    public Node next;
    public Node(object o, Node n) { value = 0; next = n;}
}

//...some code
{
    Node a = new Node("a", null), 
         b = new Node("b", a), 
         c = new Node("c", b);
    a.next = c;
} //end of scope
//...other code

a, bc쓰레기 수집해야하지만, 그들은 다른 모든 객체에 의해 참조되고있다.

Java 가비지 콜렉션은이를 어떻게 처리합니까? (또는 단순히 메모리 소모입니까?)


Java의 GC는 가비지 콜렉션 루트에서 시작하는 체인을 통해 도달 할 수없는 경우 "가비지"로 간주되므로 이러한 오브젝트가 수집됩니다. 주기를 형성하기 위해 객체가 서로를 가리킬 수도 있지만 루트에서 잘라 내면 여전히 쓰레기입니다.

부록 A : Java 플랫폼 성능의 가비지 콜렉션에 대한 진실 : 전략 및 전술 에서 도달 할 수없는 오브젝트에 대한 섹션을 참조 하십시오.


예 Java 가비지 콜렉터가 순환 참조를 처리합니다!

How?

가비지 수집 루트 (GC 루트)라는 특수 객체가 있습니다. 이것들은 항상 도달 할 수 있으며 자체 루트에있는 모든 객체입니다.

간단한 Java 애플리케이션에는 다음과 같은 GC 루트가 있습니다.

  1. 주요 방법의 지역 변수
  2. 주요 실
  3. 메인 클래스의 정적 변수

여기에 이미지 설명을 입력하십시오

더 이상 사용하지 않는 오브젝트를 판별하기 위해 JVM은 mark-and-sweep 알고리즘 이라고하는 것을 간헐적으로 실행 합니다 . 다음과 같이 작동합니다

  1. 이 알고리즘은 GC 루트부터 시작하여 모든 객체 참조를 통과하고 발견 된 모든 객체를 살아있는 것으로 표시합니다.
  2. 표시된 오브젝트가 차지하지 않는 모든 힙 메모리가 재 확보됩니다. 단순히 사용되지 않는 것으로 표시되어 있으며 기본적으로 사용되지 않는 객체가 없습니다.

따라서 GC 루트에서 객체에 도달 할 수없는 경우 (자체 참조 또는 순환 참조 인 경우에도) 가비지 수집 대상이됩니다.

물론 프로그래머가 객체의 역 참조를 잊어 버린 경우 메모리 누수가 발생할 수 있습니다.

여기에 이미지 설명을 입력하십시오

출처 : 자바 메모리 관리


가비지 수집기는 CPU 레지스터, 스택 및 전역 변수와 같이 항상 "연결 가능한"것으로 간주되는 일부 "루트"위치에서 시작합니다. 해당 영역에서 포인터를 찾고 그들이 가리키는 모든 것을 재귀 적으로 찾아서 작동합니다. 모든 것이 발견되면 다른 모든 것은 쓰레기입니다.

There are, of course, quite a few variations, mostly for the sake of speed. For example, most modern garbage collectors are "generational", meaning that they divide objects into generations, and as an object gets older, the garbage collector goes longer and longer between times that it tries to figure out whether that object is still valid or not -- it just starts to assume that if it has lived a long time, chances are pretty good that it'll continue to live even longer.

Nonetheless, the basic idea remains the same: it's all based on starting from some root set of things that it takes for granted could still be used, and then chasing all the pointers to find what else could be in use.

Interesting aside: may people are often surprised by the degree of similarity between this part of a garbage collector and code for marshaling objects for things like remote procedure calls. In each case, you're starting from some root set of objects, and chasing pointers to find all the other objects those refer to...


You are correct. The specific form of garbage collection you describe is called "reference counting". The way it works (conceptually, at least, most modern implementations of reference counting are actually implemented quite differently) in the simplest case, looks like this:

  • whenever a reference to an object is added (e.g. it is assigned to a variable or a field, passed to method, and so on), its reference count is increased by 1
  • whenever a reference to an object is removed (the method returns, the variable goes out of scope, the field is re-assigned to a different object or the object which contains the field gets itself garbage collected), the reference count is decreased by 1
  • as soon as the reference count hits 0, there is no more reference to the object, which means nobody can use it anymore, therefore it is garbage and can be collected

And this simple strategy has exactly the problem you decribe: if A references B and B references A, then both of their reference counts can never be less than 1, which means they will never get collected.

There are four ways to deal with this problem:

  1. Ignore it. If you have enough memory, your cycles are small and infrequent and your runtime is short, maybe you can get away with simply not collecting cycles. Think of a shell script interpreter: shell scripts typically only run for a few seconds and don't allocate much memory.
  2. Combine your reference counting garbage collector with another garbage collector which doesn't have problems with cycles. CPython does this, for example: the main garbage collector in CPython is a reference counting collector, but from time to time a tracing garbage collector is run to collect the cycles.
  3. Detect the cycles. Unfortunately, detecting cycles in a graph is a rather expensive operation. In particular, it requires pretty much the same overhead that a tracing collector would, so you could just as well use one of those.
  4. Don't implement the algorithm in the naive way you and I would: since the 1970s, there have been multiple quite interesting algorithms developed that combine cycle detection and reference counting in a single operation in a clever way that is significantly cheaper than either doing them both seperately or doing a tracing collector.

By the way, the other major way to implement a garbage collector (and I have already hinted at that a couple of times above), is tracing. A tracing collector is based on the concept of reachability. You start out with some root set that you know is always reachable (global constants, for example, or the Object class, the current lexical scope, the current stack frame) and from there you trace all objects that are reachable from the root set, then all objects that are reachable from the objects reachable from the root set and so on, until you have the transitive closure. Everything that is not in that closure is garbage.

Since a cycle is only reachable within itself, but not reachable from the root set, it will be collected.


The Java GCs don't actually behave as you describe. It's more accurate to say that they start from a base set of objects, frequently called "GC roots", and will collect any object that can not be reached from a root.
GC roots include things like:

  • static variables
  • local variables (including all applicable 'this' references) currently in the stack of a running thread

So, in your case, once the local variables a, b, and c go out of scope at the end of your method, there are no more GC roots that contain, directly or indirectly, a reference to any of your three nodes, and they'll be eligible for garbage collection.

TofuBeer's link has more detail if you want it.


This article (no longer available) goes into depth about the garbage collector (conceptually... there are several implementations). The relevant part to your post is "A.3.4 Unreachable":

A.3.4 Unreachable An object enters an unreachable state when no more strong references to it exist. When an object is unreachable, it is a candidate for collection. Note the wording: Just because an object is a candidate for collection doesn't mean it will be immediately collected. The JVM is free to delay collection until there is an immediate need for the memory being consumed by the object.


Garbage collection doesn't usually mean "clean some object iff nothing else is 'pointing' to that object" (that's reference counting). Garbage collection roughly means finding objects that can't be reached from the program.

So in your example, after a,b, and c go out of scope, they can be collected by the GC, since you can't access these objects anymore.


Bill answered your question directly. As Amnon said, your definition of garbage collection is just reference counting. I just wanted to add that even very simple algorithms like mark and sweep and copy collection easily handle circular references. So, nothing magic about it!

참고URL : https://stackoverflow.com/questions/1910194/how-does-java-garbage-collection-work-with-circular-references

반응형