[LINUX] What is a C language library? What is the information that is open to the public?

** Introduction **

This article covers the following topics.

  1. What library information is open to the outside? (a rough explanation)
  2. How to check public information
  3. How to limit public information

The content is a deeper dive into an earlier article. I will summarize once more what a library is and what its public API is.

** What is the library information published to the outside? **

** First of all, what is a library? **

A program starts at the main function and implements its features using the functions that main calls and the data they use. Since the program cannot work without those functions and data, each of them always has an entity (a definition) somewhere in the program.

program.png

As you write programs this way, sooner or later you want to cut corners, because you find yourself needing exactly the same functions every time. A ** library ** is a mechanism for bundling such ** identical, reusable functions **.

ライブラリ.png

** To use the library **

A library is associated with a program when the program is built. This association is called ** linking the library **, or simply ** linking **.

For a library to be linkable, it must do the following:

-** Make the functions and data that other programs need available from the library **

In other words, it has to disclose information about the `apllo()` function and the cups used in the figure. This is today's main theme: ** information that is open to the public **. Once the function information is resolved by linking, the program can use the library. How the library is then used depends on the type of library.

** Static library (.a) **

This is a library in a format that the program takes in as-is at build time. Because it is built into the program, the functions already exist when the program runs, so ** you can use the program without thinking about the library **. On the other hand, because the library code is copied into the program, ** the program size increases **.

static_lib.png

** Shared library (.so) **

At build time, the program records only information about the library, and it is associated with the actual library file at run time. (This, too, is called ** linking the library **.)

shared_lib.png

With a shared library, the program only holds information about the library, so you must ** make the target library linkable ** when the program runs.

The mechanism is described [here](https://qiita.com/developer-kikikaikai/items/f6f87b2d1d7c3e14fb52#%E5%9F%BA%E6%9C%AC%E7%9A%84%E3%81%AAlinux%E3%81%AE%E5%8B%95%E7%9A%84%E3%83%A9%E3%82%A4%E3%83%96%E3%83%A9%E3%83%AA%E3%83%AA%E3%83%B3%E3%82%AF%E3%81%AB%E9%96%A2%E3%81%99%E3%82%8B%E4%BB%95%E7%B5%84%E3%81%BF).

Simply put, you are fine if the `ldd` command prints a path for the target library. If it does not, you need to supply the library path at build time or through an environment variable such as `LD_LIBRARY_PATH`. Everything is OK as long as the right-hand side of `=>` resolves to a file, as shown below.

$ ldd /usr/bin/curl
        linux-vdso.so.1 (0x00007ffe307ac000)
        libcurl.so.4 => /usr/lib/x86_64-linux-gnu/libcurl.so.4 (0x00007f85cce4e000)
...
        libssl.so.1.1 => /usr/lib/x86_64-linux-gnu/libssl.so.1.1 (0x00007f85cbb4b000)
        libcrypto.so.1.1 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 (0x00007f85cb6d3000)
...
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f85c6c95000)

** Supplement **

2018/05/16 postscript

~~I had been using them incorrectly in this article, so~~ here are the formal names of the library types. I would like to take this opportunity to summarize them.

| Name | Example | Overview |
|---|---|---|
| Static library | Linux xx.a, Windows xxx.lib | Library in the format incorporated into the program |
| Shared library | Linux xx.so, Windows xxx.dll | Library linked at program startup |
| Dynamic library | Linux dlopen(), Windows LoadLibrary() | Library linked while the program is running |

An xx.a file feels "dynamic" to me because it gets flexibly incorporated into all sorts of programs. I often get the names wrong unless I sort it out in my head: "Dynamic... oh, the name comes from the different way the library is taken in." ...Maybe that is just me.
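To make the "dynamic library" row concrete, here is a minimal sketch of linking while the program is running. It assumes a glibc-based Linux where the math library is installed as `libm.so.6`; the function name `cos_via_dlopen` is made up for this illustration.

```c
/* Sketch only: load a library while the program is running.
 * Assumes a glibc-based Linux where the math library is "libm.so.6".
 * cos_via_dlopen is a hypothetical name used for illustration. */
#include <dlfcn.h>
#include <stddef.h>

/* Looks up cos() at run time and stores cos(x) in *result.
 * Returns 0 on success, -1 if the library or symbol is unavailable. */
int cos_via_dlopen(double x, double *result)
{
    void *handle = dlopen("libm.so.6", RTLD_LAZY);
    if (handle == NULL) {
        return -1;
    }

    /* dlsym returns void *; convert it to the function pointer type. */
    double (*cos_fn)(double) = (double (*)(double))dlsym(handle, "cos");
    if (cos_fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *result = cos_fn(x);
    dlclose(handle);
    return 0;
}
```

Because the library is looked up by name at run time, `ldd` on a program that uses only this approach would not list libm. On older glibc versions the program must additionally be linked with `-ldl`.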

** Library information exposed to the outside: external linkage **

That was a long preamble. The main part starts here. I keep saying "information that is open to the outside", but what is it? In programming terms, it is anything ** with external linkage **. (In this article I deliberately use the phrase ** public information **.)

Put very roughly, in C this basically means ** functions and file-scope variables that are not defined as static **. Note that the thinking differs slightly between C and C++. Apparently, ** in C++, if the class itself is not static, internal variables may not be handled well even when the method is static **. In addition, ** C++ has uses of static that have nothing to do with public information, so you need to understand the difference **. For more on C++, see the C++ article listed in the references. This article focuses on C.

The problem with this rule alone is that ** every function used across files inside the library becomes public information! ** So from here on I show how to check public information and how to limit it. This partially overlaps with the earlier article.
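As a minimal sketch of that problem, the following C code imitates two source files of one library. All names (`clamp_index`, `mylib_internal_pick`, `mylib_last`) are hypothetical and do not come from the libraries discussed in this article. A helper that is needed across files cannot be `static`, so it ends up with external linkage just like the intended API.

```c
/* Sketch only: all names here are hypothetical. Imagine the two parts
 * below living in separate files, util.c and api.c, of one library. */
#include <stddef.h>

/* ---- util.c ---- */

/* static: internal linkage, visible only inside this file.
 * length must be at least 1. */
static size_t clamp_index(size_t index, size_t length)
{
    return (index < length) ? index : length - 1;
}

/* Not static, because api.c needs to call it. The side effect is
 * external linkage: programs that link the library can see it too. */
int mylib_internal_pick(const int *values, size_t length, size_t index)
{
    return values[clamp_index(index, length)];
}

/* ---- api.c ---- */

/* The one function the library author actually wants to publish. */
int mylib_last(const int *values, size_t length)
{
    /* An out-of-range index is clamped to the last element. */
    return mylib_internal_pick(values, length, length);
}
```

From the linker's point of view, `mylib_internal_pick` and `mylib_last` are equally public here; only `clamp_index` is private.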

** How to check public information (external linkage) **

** nm command **

You can check the public information with the nm command.

For example:

First, check the program itself.

$ nm -D test
                 w __cxa_finalize
                 w __gmon_start__
                 w _ITM_deregisterTMCloneTable
                 w _ITM_registerTMCloneTable
                 U __libc_start_main
                 U memcmp
                 U __printf_chk
                 U publisher_free
                 U publisher_new
                 U publisher_publish
                 U publisher_subscribe
                 U publisher_unsubscribe
                 U puts
                 U __stack_chk_fail

Lowercase letters generally mark local symbols (the `w` entries here are weak symbols); I omit the details. What matters are the capital letters. ** U means undefined **: a function that must be resolved at run time by linking a shared library. Since the publisher_XXX functions are my own, the program cannot use them unless the library publishes them.

On the other hand, the linked library looks like this:

$ nm .libs/libpublisher.so
...
0000000000000f90 t dputil_list_pop
0000000000000f50 t dputil_list_pull
0000000000000f20 t dputil_list_push
0000000000000ee0 t dputil_lock
0000000000000f00 t dputil_unlock
...
                 U pthread_mutex_init@@GLIBC_2.2.5
                 U pthread_mutex_lock@@GLIBC_2.2.5
                 U pthread_mutex_unlock@@GLIBC_2.2.5
                 U __pthread_register_cancel@@GLIBC_2.3.3
                 U __pthread_unregister_cancel@@GLIBC_2.3.3
                 w __pthread_unwind_next@@GLIBC_2.3.3
...
00000000000009b0 T publisher_free
0000000000202090 b publisher_g
0000000000000a00 T publisher_new
0000000000000b20 T publisher_publish
0000000000000a90 T publisher_subscribe
0000000000000ae0 T publisher_unsubscribe
0000000000000910 t register_tm_clones
                 U __sigsetjmp@@GLIBC_2.2.5
                 U __stack_chk_fail@@GLIBC_2.4
0000000000202078 d __TMC_END__

** T is public information ** (a global symbol in the code section). The publisher_xxx functions are all there. In other words, everything works as long as libpublisher.so.0.0.0 can be linked at run time. pthread_mutex_init and the like are ** U **, but since they are standard library functions they get linked as usual.

By the way, the dputil_xxx functions appear above as local functions inside libpublisher.so, but they are actually functions from a static library.

$ nm .libs/libdputil.a

dp_util.o:
00000000000000b0 T dputil_list_pop
0000000000000070 T dputil_list_pull
0000000000000040 T dputil_list_push
0000000000000000 T dputil_lock
0000000000000020 T dputil_unlock
                 U _GLOBAL_OFFSET_TABLE_
                 U pthread_mutex_lock
                 U pthread_mutex_unlock

The fact that they are not ** U ** in libpublisher.so also shows that libpublisher.so took them in at build time.
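To connect the nm letters back to source code, here is a small sketch with hypothetical names (`greet_count`, `bump_count`, `demo_greet`). The letters in the comments are what I would typically expect from an unoptimized build on x86-64 Linux with gcc; an optimizing build may inline the static function and drop its symbol.

```c
/* Sketch only: hypothetical names. Build with `gcc -c -O0 symbols.c`
 * and run `nm symbols.o` to compare against the comments. */
#include <stdio.h>

static int greet_count;              /* typically "b": local, zero-initialized data */

static int bump_count(void)          /* typically "t": local function */
{
    return ++greet_count;
}

int demo_greet(const char *name)     /* typically "T": public function */
{
    puts(name);                      /* puts: typically "U", resolved from libc */
    return bump_count();
}
```

The same reading applies to the listings above: `publisher_g` is a `b` entry, the dputil_xxx functions are `t`, and the publisher_xxx API is `T`.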

** objdump command **

You can also dump the information a program holds with the `objdump` command. It shows much more than link information, so it should be useful for analysis once you master it (which, I think, comes close to reading assembly).

$ objdump -t libpublisher.so.0.0.0

libpublisher.so.0.0.0:     file format elf64-x86-64

SYMBOL TABLE:
...
0000000000000ee0 l     F .text  0000000000000012              dputil_lock
...
0000000000000000       F *UND*  0000000000000000              free@@GLIBC_2.2.5
...
0000000000000ae0 g     F .text  0000000000000032              publisher_unsubscribe
...

As shown here, ** g means public (global) **, ** l means local **, and ** UND means undefined **. Please let me skip the detailed meaning of each term.

** How to limit public information (external linkage) **

** Restricting the API with --version-script **

This is the same as in the previous article. ** List the functions to publish in XXX.map and add `-Wl,--version-script,libtimelog.map` to the build options **.

LDFLAGS+=-Wl,--version-script,libtimelog.map

libtimelog.map


{
  global:
    timetestlog_init;
    timetestlog_store_printf;
    timetestlog_exit;
  local: *;
};

It's simple and easy to understand, but it's a hassle to create a configuration file.

** -fvisibility=hidden **

If you specify `-fvisibility=hidden`, every function is made private by default. You then add `__attribute__((visibility("default")))` only to what you need and publish just those. It is a perfect fit for people who like to control things through coding rules.

For the libpublisher.so used in the nm section, I specified `-fvisibility=hidden` and declared each public function like `int __attribute__((visibility("default"))) publisher_new(size_t contents_num)` before building.

As a result, it bothers me that the libdputil.a API still shows up as T (presumably because the static library itself was not built with `-fvisibility=hidden`), but otherwise the exports are nicely limited.

$ nm -D .libs/libpublisher.so.0.0.0
0000000000202098 B __bss_start
                 U calloc
                 w __cxa_finalize
0000000000001210 T dputil_list_pop
00000000000011d0 T dputil_list_pull
00000000000011a0 T dputil_list_push
0000000000001160 T dputil_lock
0000000000001180 T dputil_unlock
0000000000202098 D _edata
00000000002020c0 B _end
0000000000001224 T _fini
                 U free
                 w __gmon_start__
0000000000000a00 T _init
                 w _ITM_deregisterTMCloneTable
                 w _ITM_registerTMCloneTable
                 U pthread_mutex_init
                 U pthread_mutex_lock
                 U pthread_mutex_unlock
                 U __pthread_register_cancel
                 U __pthread_unregister_cancel
                 w __pthread_unwind_next
0000000000000c10 T publisher_free
0000000000000c80 T publisher_new
0000000000000da0 T publisher_publish
0000000000000d10 T publisher_subscribe
0000000000000d60 T publisher_unsubscribe
                 U __sigsetjmp
                 U __stack_chk_fail
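The `-fvisibility=hidden` pattern described above could be sketched as follows. The macro `MYLIB_API` and both function names are hypothetical, chosen only for this illustration; the attribute syntax is the one accepted by GCC and Clang.

```c
/* Sketch only: MYLIB_API and the function names are hypothetical.
 * Intended build:
 *   gcc -shared -fPIC -fvisibility=hidden mylib.c -o libmylib.so */
#include <stddef.h>

/* Marks a symbol as exported even when -fvisibility=hidden is in
 * effect (GCC and Clang attribute syntax). */
#define MYLIB_API __attribute__((visibility("default")))

/* No attribute: hidden under -fvisibility=hidden, although it is not
 * static and can still be called from other files of the library. */
size_t mylib_count_nonzero(const int *values, size_t length)
{
    size_t count = 0;
    for (size_t i = 0; i < length; i++) {
        if (values[i] != 0) {
            count++;
        }
    }
    return count;
}

/* Exported: the only function in this file meant to be public. */
MYLIB_API int mylib_has_data(const int *values, size_t length)
{
    return mylib_count_nonzero(values, length) > 0;
}
```

Wrapping the attribute in a macro keeps the declarations readable and leaves one place to change if the library is later built with another compiler.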

** My preference **

I prefer ** --version-script ** because I do not have to think about public versus private while writing code. It is easier for me to list everything together in one config file.

2018/05/20 postscript: I also like that this approach can be automated with a script. I wrote a sample script that searches the include directories and generates a version script. It does not handle macros well, but I think it is usable as-is.

#!/bin/sh

output_conf_map() {
        #{
        #  global:
        #    function_name;
        # ...
        #  local: *;
        #}

        #only get function list from header file
        HEADER_FUNC_LIST=$(grep -r "[a-zA-Z](" "$1" | grep -v "@brief" | grep -v "#define" | awk -F"(" '{print $1}' | awk -F " " '{print $NF}')
        #template
        echo "{"
        echo "  global:"

        #show all function
        for data in $HEADER_FUNC_LIST
        do
                echo "    $data;"
        done

        #template end
        echo "  local: *;"
        echo "};"
}

INCLUDE_LIST=`find . -name include`
for inc_dir in $INCLUDE_LIST
do
        echo "$inc_dir"
        output_conf_map "$inc_dir"
done

** Impressions **

First, an apology. This article began when someone commented on my article on restricting library exports: "Why are there several different ways to restrict the public API of a shared library? What does public even mean in the first place? Why do people live?" So at first I aimed to answer the question precisely, in the proper language of the programming field.

However, it turned out to be impossible to talk about this area without digging deep, and since it was really turning into a Zen dialogue, I stepped back and asked myself, "What was it about libraries that I wanted to organize?" Thanks to that, I feel the fuzzy parts, which I only half understood as a developer working on middleware and the layers above it, have become clear.

I still run into plenty of trouble around libraries, but knowing how to check things reduces the anxiety, especially around OSS. ldd and nm are super convenient.

** References **

A site that explains memory handling very carefully (it also made me think, "Oh, this is a topic I will get stuck in if I dig any deeper"): When asked for the code size | Things you can't tell at school | [Technical column collection] Built-in gate | Uquest Co., Ltd.

How functions look from the assembler's point of view: A story of being easily defeated before recent technology when trying to show old techniques | Possible Eria

Definition of external linkage: https://msdn.microsoft.com/ja-jp/library/k8w8btzz.aspx

How to limit library exports in C++: How to write a shared library in C++ format (gcc edition) - Qiita

How to read nm output: Display list of symbols from object with nm command - Qiita

Library names: [Library-Basic knowledge of communication terms](http://www.wdic.org/w/TECH/ Library)
