Atomic (C++17)

tsyang 2022. 7. 3. 01:28

Why do we need it?

#include <iostream>
#include <thread>
#include <vector>
using namespace std;

void add (int & num)
{
	for(int i=0;i <1000000;++i)
		++num;
}

int main()
{
	int num = 0;
	vector<thread> threads;

	for (int i = 0; i < 4; ++i)
		threads.emplace_back(add, std::ref(num));

	for (auto & thread : threads)
		thread.join();

	cout << num << endl;
}

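The code above has four threads each add 1 to the variable num one million times.

You might expect it to print 4,000,000,

but because of the data race it prints some other, unpredictable number (formally, a data race on a non-atomic variable is undefined behavior).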
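To see where the updates go missing, remember that ++num on a plain int is not a single step: the value is loaded, incremented, and stored back. The following is a minimal single-threaded sketch that replays one unlucky interleaving by hand. The names lost_update_demo, reg_a, and reg_b are made up for illustration; the two "registers" stand in for two threads.

```cpp
// Replays, step by step, one bad interleaving of two threads that both
// execute ++num. Deliberately single-threaded so the outcome is deterministic.
// (lost_update_demo, reg_a and reg_b are hypothetical names for this sketch.)
int lost_update_demo()
{
	int num = 0;

	int reg_a = num;	// "thread A" loads 0
	int reg_b = num;	// "thread B" loads 0 before A has stored anything
	reg_a = reg_a + 1;	// A computes 1
	reg_b = reg_b + 1;	// B computes 1
	num = reg_a;		// A stores 1
	num = reg_b;		// B stores 1 too, overwriting A's update

	return num;		// two increments were performed, but only one survives
}
```

Calling lost_update_demo() returns 1 instead of 2. In the real multi-threaded program such collisions happen at unpredictable moments, which is why the printed total differs from run to run.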

The obvious next thought: let's synchronize the threads with a mutex!

#include <iostream>
#include <thread>
#include <vector>
#include <ctime>
#include <mutex>

using namespace std;

mutex mtx;

void add (int & num)
{
	for(int i=0;i <1000000;++i)
	{
		lock_guard<mutex> lock{ mtx };
		++num;
	}
}

int main()
{
	int num = 0;
	vector<thread> threads;

	auto start = clock();
	
	for (int i = 0; i < 4; ++i)
		threads.emplace_back(add, std::ref(num));

	for (auto & thread : threads)
		thread.join();

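	auto end = clock();

	cout << num << endl;
	// clock() returns ticks, so convert to milliseconds via CLOCKS_PER_SEC.
	// Caveat: on POSIX systems clock() reports CPU time summed over all threads, not wall-clock time.
	cout << "time : " << (end - start) * 1000 / CLOCKS_PER_SEC << "ms" << endl;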
}

The result is correctly 4,000,000, but it took a whole 1636 ms.

Using an atomic here improves the speed. In the following code the variable num is declared as atomic<int>.

#include <iostream>
#include <atomic>
#include <thread>
#include <vector>
#include <ctime>

using namespace std;

void add (atomic<int> & num)
{
	for(int i=0;i <1000000;++i)
	{
		++num;
	}
}

int main()
{
	atomic<int> num{ 0 };	// 0 is the initial value
	vector<thread> threads;

	auto start = clock();
	
	for (int i = 0; i < 4; ++i)
		threads.emplace_back(add, std::ref(num));

	for (auto & thread : threads)
		thread.join();

	auto end = clock();

	cout << num << endl;
	// clock() returns ticks, so convert to milliseconds via CLOCKS_PER_SEC.
	cout << "time : " << (end - start) * 1000 / CLOCKS_PER_SEC << "ms" << endl;
}

The result is correct, and it runs much faster.
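For a side-by-side comparison, here is a hedged sketch that runs both variants through the same harness and measures wall-clock time with std::chrono::steady_clock. The helper names run_timed, count_with_mutex, count_with_atomic, and the Result struct are assumptions made for this example, not part of the original code. The numbers depend heavily on the compiler, the OS, and the number of cores.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

// Runs `worker` on `thread_count` threads and returns the elapsed
// wall-clock time in milliseconds. (Hypothetical helper for this sketch.)
template <typename Worker>
long long run_timed(int thread_count, Worker worker)
{
	auto start = std::chrono::steady_clock::now();

	std::vector<std::thread> threads;
	for (int i = 0; i < thread_count; ++i)
		threads.emplace_back(worker);
	for (auto& t : threads)
		t.join();

	auto end = std::chrono::steady_clock::now();
	return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
}

struct Result
{
	int count;		// final counter value
	long long ms;	// elapsed wall-clock time
};

// Counter protected by a mutex, as in the second listing.
Result count_with_mutex(int thread_count, int iterations)
{
	int num = 0;
	std::mutex mtx;
	long long ms = run_timed(thread_count, [&] {
		for (int i = 0; i < iterations; ++i)
		{
			std::lock_guard<std::mutex> lock{ mtx };
			++num;
		}
	});
	return { num, ms };
}

// Counter stored in std::atomic<int>, as in the third listing.
Result count_with_atomic(int thread_count, int iterations)
{
	std::atomic<int> num{ 0 };
	long long ms = run_timed(thread_count, [&] {
		for (int i = 0; i < iterations; ++i)
			++num;
	});
	return { num.load(), ms };
}
```

Calling count_with_mutex(4, 1000000) and count_with_atomic(4, 1000000) should both yield a count of 4,000,000; comparing the ms fields shows the gap on your own machine.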

Which types does atomic<T> support?

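So does atomic support just any type? No.

Roughly speaking, T must be trivially copyable (std::is_trivially_copyable_v<T> is true), which means:

1. It occupies a contiguous chunk of memory.

2. Copying the object is the same as copying all of its bits (memcpy).

3. It has no virtual functions, and no user-provided copy/move constructors, copy/move assignment operators, or destructor.

So types such as int, long, char, and int* obviously work, and so do some classes and structs.

Note that being usable and being lock-free are different things: when an object is too large for the CPU's atomic instructions, the implementation falls back to an internal lock. You can check which case applies with the is_lock_free() method.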

#include <atomic>
#include <iostream>

using namespace  std;

struct SomeStruct1
{
	int a, b, c;
};

struct SomeStruct2
{
	int a[100];
	int b;
};

int main()
{
	atomic<SomeStruct1> ss1;
	cout << ss1.is_lock_free() << endl;	// typically prints 0 (false): 12 bytes does not fit a single atomic instruction

	atomic<SomeStruct2> ss2;
	cout << ss2.is_lock_free() << endl;	// prints 0 (false): far too large

	atomic<int**> doubleptr;
	cout << doubleptr.is_lock_free() << endl;	// prints 1 (true)
	
	return 0;
}
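Whether a type may be used with atomic<T> at all can be checked at compile time with the standard type traits. The sketch below is only an illustration: Plain, WithVirtual, and usable_in_atomic are hypothetical names introduced here. It also shows std::atomic<T>::is_always_lock_free, the C++17 compile-time counterpart of is_lock_free().

```cpp
#include <atomic>
#include <string>
#include <type_traits>

struct Plain { int a, b, c; };				// hypothetical: plain data only
struct WithVirtual { virtual void f() {} int a; };	// hypothetical: has a virtual function

// Compile-time checks of the "trivially copyable" requirement.
static_assert(std::is_trivially_copyable_v<Plain>, "plain structs qualify");
static_assert(!std::is_trivially_copyable_v<WithVirtual>, "virtual functions disqualify");
static_assert(!std::is_trivially_copyable_v<std::string>, "std::string manages memory");

// Hypothetical helper: true when T meets the requirements std::atomic<T>
// places on its template argument.
template <typename T>
constexpr bool usable_in_atomic()
{
	return std::is_trivially_copyable_v<T>
		&& std::is_copy_constructible_v<T>
		&& std::is_move_constructible_v<T>
		&& std::is_copy_assignable_v<T>
		&& std::is_move_assignable_v<T>;
}

// C++17: known at compile time, unlike the is_lock_free() member function.
// The value is platform-dependent, so it is not asserted here.
constexpr bool int_always_lock_free = std::atomic<int>::is_always_lock_free;
```

A failing check such as usable_in_atomic<std::string>() means atomic<std::string> is not allowed at all, whereas is_lock_free() returning false only means the operations are implemented with a lock.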

Which operations does it support?

It supports most of the operations that ordinary variables use, roughly as follows.

atomic<int> x{ 0 };
int y;

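// equivalent
x.store(10);
x = 10;

// equivalent
y = x.load();
y = x;

// integer types only (fetch_add is also available for pointers); equivalent
x.fetch_add(y);
x += y;

++x;
x++;

x |= 2; // sets a bit

// x *= 2; multiplication is not supported (compile error)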

However, expressions like the following need some care with atomics.

atomic<int> x{0};

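int y = x * 2; // atomic read (x)
x = y + 1; // atomic write (x)

x = x + 1; // NOT the same as x += 1! An atomic read followed by a separate atomic write; another thread can modify x in between
x = x * 2; // this kind of multiplication compiles, because it is just a read followed by a write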
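The practical difference shows up in the return value of fetch_add: because the read and the write happen as one indivisible step, every caller sees a distinct previous value. The following is a small sketch of that property; draw_tickets is a hypothetical function written for this illustration.

```cpp
#include <algorithm>
#include <atomic>
#include <thread>
#include <vector>

// Each thread draws `per_thread` tickets from a shared counter using
// fetch_add. Returns all drawn tickets, sorted. (Hypothetical example.)
std::vector<int> draw_tickets(int thread_count, int per_thread)
{
	std::atomic<int> next{ 0 };
	std::vector<std::vector<int>> drawn(thread_count);	// one private list per thread
	std::vector<std::thread> threads;

	for (int t = 0; t < thread_count; ++t)
	{
		threads.emplace_back([&next, &drawn, t, per_thread] {
			drawn[t].reserve(per_thread);
			for (int i = 0; i < per_thread; ++i)
				drawn[t].push_back(next.fetch_add(1));	// returns the value before the add
		});
	}
	for (auto& th : threads)
		th.join();

	// Merge the per-thread lists and sort them for easy inspection.
	std::vector<int> all;
	for (auto& v : drawn)
		all.insert(all.end(), v.begin(), v.end());
	std::sort(all.begin(), all.end());
	return all;
}
```

With draw_tickets(4, 1000) the result is exactly 0, 1, 2, ..., 3999 with no duplicates and no gaps. If the body used a separate read and write (the x = x + 1 pattern) instead of fetch_add, two threads could read the same value and draw the same ticket.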

More complex operations are supported as well.

atomic<int> x{ 0 };

int y = x.exchange(10);
// same meaning as: y = x; x = 10; (performed as one atomic step)

bool result = x.compare_exchange_strong(y, 10);
// compare_exchange_strong means the following, performed as one atomic step:
//if(x==y)
//{
//	x = 10;
//	return true;
//}
//else
//{
//	y = x;
//	return false;
//}


// The weak compare_exchange is generally faster than strong, but it may fail spuriously (even when x == y).
// So it is usually retried in a loop like this:
while(!x.compare_exchange_weak(y, 10))
{
	
}
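A compare-exchange loop is also the usual way to build an atomic operation that atomic<int> does not provide, such as the multiplication mentioned earlier. The following is a minimal sketch; atomic_multiply and multiply_from_threads are hypothetical names introduced for this example.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Atomically performs x = x * factor and returns the previous value.
// (Hypothetical helper; std::atomic has no built-in multiply.)
int atomic_multiply(std::atomic<int>& x, int factor)
{
	int expected = x.load();
	// On failure, compare_exchange_weak writes the current value of x into
	// `expected`, so the next attempt recomputes the product from fresh data.
	while (!x.compare_exchange_weak(expected, expected * factor))
	{
	}
	return expected;
}

// Starts from 1 and lets several threads double the value concurrently.
int multiply_from_threads(int thread_count, int doublings_per_thread)
{
	std::atomic<int> x{ 1 };
	std::vector<std::thread> threads;

	for (int t = 0; t < thread_count; ++t)
	{
		threads.emplace_back([&x, doublings_per_thread] {
			for (int i = 0; i < doublings_per_thread; ++i)
				atomic_multiply(x, 2);
		});
	}
	for (auto& th : threads)
		th.join();

	return x.load();
}
```

With 4 threads doubling 5 times each, multiply_from_threads(4, 5) returns 2^20 = 1048576 no matter how the threads interleave, because no doubling is ever lost.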

Why is it fast?

So why is atomic fast?

The analogous concept in C# seems to be Interlocked.

https://tsyang.tistory.com/107

Simple Synchronization 2 - User-Mode Synchronization

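In other words, a mutex is a kernel-mode synchronization primitive (at least whenever it has to block a waiting thread), whereas an atomic is a user-mode synchronization primitive (something the CPU does for you), and that is why it is fast.

What exactly does the CPU do?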

https://stackoverflow.com/questions/18640327/how-does-interlocked-work-and-why-is-it-faster-than-lock

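According to the post above, the Interlocked methods are fast because they are operations (instructions) supported at the CPU level. While executing such an instruction, the CPU internally locks the variable (typically the cache line holding it) so that only one thread can access it at a time.

A mutex, by contrast, may have to go through the kernel, so it cannot help being slower.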
