자바 중급 2편

1. List vs Set

구분	List	Set
순서	요소 추가 순서 유지	순서 보장 X
중복 허용 여부	허용	불허
접근 방식	인덱스로 접근 가능	인덱스 없음, 값 존재 여부로 접근
대표 용도	순서가 중요한 데이터, 중복 허용	고유한 값만 저장, 빠른 검색
예시	장바구니 목록, 이력 기록	회원 ID, 고유 태그 목록

2. Set 직접 구현 (기본형)

package com.example.collection.set;

import java.util.Arrays;

public class MyHashSetV0 {

    private int[] elementData = new int[10];
    private int size = 0;

    //O(n)
    public boolean add(int value) {

        if (contains(value)) {
            return false;
        }

        elementData[size++] = value;

        return true;

    }

    public int size() {
        return size;
    }

    // O(n)
    public boolean contains(int value) {

        for (int data : elementData) {

            if (data == value) {
                return true;
            }

        }

        return false;

    }

    @Override
    public String toString() {
        return "MyHashSetV0{" +
                "elementData=" + Arrays.toString(Arrays.copyOf(elementData, size)) +
                ", size=" + size +
                '}';
    }
}

특징

단순 배열 기반 저장
추가/검색 모두 O(n) → 데이터 많아지면 성능 저하

3. 해시 알고리즘의 핵심 아이디어

💡 배열은 인덱스로 접근할 때 O(1)로 빠르다
→ 데이터를 인덱스와 매칭시켜 저장하면 검색 속도를 비약적으로 향상 가능

예시)

값 8을 찾기 → array[8] 로 즉시 접근
검색 성능: O(n) → O(1) 로 개선

package com.example.collection.set;

import java.util.Arrays;

public class HashStart2 {

    public static void main(String[] args) {

        Integer[] inputArray = new Integer[10];

        inputArray[1] = 1;

        inputArray[2] = 2;

        inputArray[5] = 5;

        inputArray[8] = 8;

        System.out.println("inputArray = " + Arrays.toString(inputArray));


        int searchValue = 8;

        System.out.println("searchValue = " + inputArray[searchValue]);

    }
}

단점
- 값의 범위가 클 경우(예: int 전체 범위) → 배열 크기 = 값의 최대치 필요
- 메모리 소모와 초기화 비용이 엄청남
  → 값이 몇 개 안 들어올 때 심각한 낭비 발생

5. 나머지 연산(Hashing)으로 해결

package com.example.collection.set;

import java.util.Arrays;

public class HashStart4 {

    static final int CAPACITY = 10;

    public static void main(String[] args) {


        System.out.println("hashIndex(1) = " + hashIndex(1));
        System.out.println("hashIndex(2) = " + hashIndex(2));
        System.out.println("hashIndex(5) = " + hashIndex(5));
        System.out.println("hashIndex(8) = " + hashIndex(8));
        System.out.println("hashIndex(14) = " + hashIndex(14));
        System.out.println("hashIndex(99) = " + hashIndex(99));


        Integer[] inputArray = new Integer[CAPACITY];

        add(inputArray, 1);
        add(inputArray, 2);
        add(inputArray, 5);
        add(inputArray, 8);
        add(inputArray, 14);
        add(inputArray, 99);


        int searchValue = 99;

        int hashIndex = hashIndex(searchValue);

        System.out.println("hashIndex = " + hashIndex);

        System.out.println("searchValue = " + inputArray[hashIndex]);

        System.out.println(Arrays.toString(inputArray));
    }


    static void add(Integer[] inputArray, int value) {

        int hashIndex = hashIndex(value);

        inputArray[hashIndex] = value;


    }

    static int hashIndex(int value) {
        return value % CAPACITY;
    }
}

CAPACITY = 배열 크기
나머지 연산 결과는 항상 0 ~ CAPACITY-1 범위
값 범위가 커도 배열 크기를 제한 가능
저장/검색 모두 O(1) 평균 성능

6. 해시 충돌(Hash Collision)

1 % 10 = 1
11 % 10 = 1  // 같은 인덱스 → 덮어쓰기 위험

Separate Chaining (분리 연결법)

각 인덱스에 LinkedList 또는 다른 자료구조 저장
같은 해시 인덱스를 가진 값들은 해당 리스트에 모두 저장
검색 시 리스트 내부 순회

package com.example.collection.set;

import java.util.Arrays;
import java.util.LinkedList;

public class HashStart5 {

    static final int CAPACITY = 10;

    public static void main(String[] args) {
        // {1, 2, 5, 8, 14, 99}

        LinkedList<Integer>[] buckets = new LinkedList[CAPACITY];

        System.out.println("buckets = " + Arrays.toString(buckets));

        for (int i = 0; i < CAPACITY; i++) {

            buckets[i] = new LinkedList<>();


        }

        System.out.println("buckets = " + Arrays.toString(buckets));

        add(buckets, 1);
        add(buckets, 2);
        add(buckets, 5);
        add(buckets, 8);
        add(buckets, 14);
        add(buckets, 99);
        add(buckets, 9);
        System.out.println("buckets = " + Arrays.toString(buckets));

        // 검색
        int searchValue = 9;

        if (contains(buckets, searchValue)) {

            System.out.println("searchValue = " + searchValue);
        }

    }

    private static void add(LinkedList<Integer>[] buckets, Integer value) {

        int hashIndex = hashIndex(value);

        LinkedList<Integer> bucket = buckets[hashIndex];

        if (!bucket.contains(value)) {

            bucket.add(value);

        }

    }

    private static boolean contains(LinkedList<Integer>[] buckets, Integer value) {
        int hashIndex = hashIndex(value);

        LinkedList<Integer> bucket = buckets[hashIndex];


        return bucket.contains(value);


    }



    static int hashIndex(int value) {
        return value % CAPACITY;
    }




}

실행결과

buckets = [[], [], [], [], [], [], [], [], [], []]
buckets = [[], [1], [2], [], [14], [5], [], [], [8], [99, 9]]
searchValue = 9

📌 충돌 확률 줄이기

배열 크기(CAPACITY) ↑ → 메모리 낭비 ↑
보통 저장 데이터 수 ≤ 배열 크기 × 0.75 유지 권장

출처: 인프런 자바 중급 2편 : 김영한

'Java' 카테고리의 다른 글

자바 중급 2편 - HashSet(2) (3)	2025.08.17
자바 중급 2편 - HashSet(1) (2)	2025.08.15
자바 중급 2편 - List(3) (3)	2025.08.10
자바 중급 2편 - List(2) (1)	2025.08.10
자바 중급 2편 - List(1) (1)	2025.08.10

JiHoons

자바 중급 2편 - Hash

1. List vs Set

2. Set 직접 구현 (기본형)

3. 해시 알고리즘의 핵심 아이디어

5. 나머지 연산(Hashing)으로 해결

6. 해시 충돌(Hash Collision)

'Java' 카테고리의 다른 글

티스토리툴바

자바 중급 2편 - Hash

1. List vs Set

2. Set 직접 구현 (기본형)

3. 해시 알고리즘의 핵심 아이디어

5. 나머지 연산(Hashing)으로 해결

6. 해시 충돌(Hash Collision)

'Java' 카테고리의 다른 글

관련글

티스토리툴바