This article describes how a large enterprise built security operations for more than 10,000 terminals from scratch and pushed the installation rate, normal rate, and compliance rate as high as they would go.
As companies pay more and more attention to security, vulnerabilities in servers and applications are becoming harder to find. One online post even asks, “Websites are getting harder and harder to penetrate. Is there any future for penetration testing?” As a result, security confrontation is shifting toward the terminal.
In addition, as more and more network-layer traffic is encrypted, NTA monitoring is losing visibility, while EDR and terminal data protection technologies keep maturing, so running a security agent on the host is also becoming the norm. On the terminal side this means upgrading existing protection; on the server side it means starting from scratch and deploying HIDS/HIPS to strengthen host-layer protection and detection.
Based on this, we run terminal security operations on the terminal side and server security operations on the server side to strengthen the company’s internal host-level protection and detection capabilities.
Zhang Youzhi heads terminal security operations in our network security department. He has been responsible for building security operations for the company’s more than 10,000 terminals from scratch, and for pushing the installation rate, normal rate, and compliance rate as high as they would go. He shared this “painful” process at a meeting: the process hurt, but the results were good.
Keywords: installation rate, normal rate, SOP, effectiveness detection, incremental problems, user habits, EDR, whitelist checks.
Since this talk is about the practice of terminal security operations, we will not pile up concepts but get down to specifics. After all, real knowledge comes from practice, and practice is the sole criterion for testing truth. We cover three topics:
- The development and current state of the company’s terminal security operations: over the past year or more, what terminal security risks and hidden dangers we found, what problems and obstacles we ran into, what operational ideas and processes we worked out, how well they worked, and where operations stand today;
- The common problems we have found in terminal security operations, the so-called pain points that everyone cares about. For some of them we have mature ideas and processes; for others we have not found a best practice and are still trying, again and again;
- The internal offensive and defensive exercise we ran in January this year: what role terminal security played in it, what role we played, and what terminal security problems it exposed.
I. The past and present of the company’s terminal security operations
The company’s overall operation of terminal security is based on the following three points:
The requirements for security operations center on two things, high visibility and high operability:
High visibility is mostly a requirement on data: its breadth, depth, and accuracy. At this stage it is expressed mainly through quantitative indicators.
High operability is mostly a requirement on process: standardized management processes, technical stability, and eventually automation. At this stage it is expressed mainly through standard operating procedures, that is, SOPs.
Our overall approach at this stage is to focus on indicators and SOPs: use standardized operating methods and procedures to raise the terminal security indicators, and in doing so solve terminal security problems and raise the overall level of terminal security. In the future we will develop in two directions. One is to broaden and deepen the indicators, adding viruses, patches, software, EDR, DLP, and so on. The other is to move gradually from SOPs to automated operation, which is the ultimate form of an SOP.
Now that we have talked about the indicators, let’s take a look at the changes in the company’s terminal security indicators over the past year:
After repeated adjustments, we determined the four basic indicators in the figure: installation rate, real-name rate, normal rate, and compliance rate. These four basic indicators can clearly show the basic protection level of all terminals in the internal network.
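As a rough illustration of how four such rates could be computed over one shared denominator, here is a minimal Python sketch. The `Terminal` fields and the way each numerator is counted are assumptions made for the example; the article does not give the exact definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Terminal:
    mac: str
    installed: bool        # terminal security client is installed
    owner: Optional[str]   # real-name binding; None means no known owner
    healthy: bool          # client is running and reporting normally
    compliant: bool        # terminal passes the security baseline


def indicators(terminals):
    """Return the four rates as percentages over one shared denominator."""
    total = len(terminals)
    if total == 0:
        raise ValueError("denominator is empty")

    def pct(count):
        return round(100.0 * count / total, 1)

    return {
        "installation_rate": pct(sum(t.installed for t in terminals)),
        "real_name_rate": pct(sum(t.owner is not None for t in terminals)),
        "normal_rate": pct(sum(t.healthy for t in terminals)),
        "compliance_rate": pct(sum(t.compliant for t in terminals)),
    }
```

Because every rate divides by the same `total`, any newly discovered terminal lowers all four rates at once until it is brought under management.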
When we reviewed how the four basic indicators had changed over more than a year, the first thing we noticed was unexpected: the indicators did not rise steadily as we had imagined, but rose and fell sharply. When I saw this chart, my first reaction was:
Isn’t this the legendary Dunning-Kruger effect?
The Dunning-Kruger effect, as popularly told, describes the stages of human cognition: the peak of ignorance, the valley of despair, the slope of enlightenment, and the stable plateau. In other words, not knowing that you don’t know, knowing that you don’t know, and knowing that you know. This matches the stages of our security operations exactly: unable to find problems, finding problems we could not solve, and then working out how to solve them.
1. Peak of ignorance
The first stage is also the most interesting part of the Dunning-Kruger effect. The so-called peak of ignorance is the stage where no problems can be found. Once we begin to realize that problems exist, we begin to move from the peak of ignorance to the next stage.
Before this, we already had requirements and control measures in place that obliged every terminal to install our terminal security software. We had always assumed that, with those requirements and admission control in place, every terminal had the software installed and was protected. Then one day abnormal behavior from one IP was spotted on the traffic side, and the terminal was confirmed to be infected with EternalBlue. We have an SOP for EternalBlue infections, but when we confidently went to confirm this terminal, we found no record of it on the console and could not identify its user. In the end we could only track it down through various login logs, which took nearly two hours, far beyond the disposal time we had set. In security operations we have always held that behind one anomaly there must be ten or even a hundred more, just as with an iceberg you only ever see the tip above the water. So when this happened, the first questions in our minds were:
- Why, with admission control in place, are there still terminals we cannot locate? Does it mean our controls have failed?
- If so, when did they fail? What is the scope of the failure? What caused it?
Once we started thinking this way, problems we had never noticed, and problems we had noticed but ignored, surfaced one after another: IT had no asset ledger, there were many BYOD devices, many VPN connections, and many types of business; there was no collaboration with operations and IT, no asset operations, no real means of control, and so on.
Facing this pile of problems, and realizing that we were standing on the peak of ignorance, the first thing we tried to do was make the problems visible: quantify the actual situation and let data show us where we really stood, what problems we had, and how serious they were. This is how our terminal security indicators were born.
2. Valley of Despair
After we set the terminal security indicators, we started to try to make the indicators more accurately reflect the real situation, and then improve the indicators. However, this stage can be said to be the most painful stage in the whole year, which is called the Valley of Despair.
At this stage, problems expand without limit both horizontally and vertically, like a tear that keeps getting wider and deeper. Horizontally, the scope of the problems grows and more and more areas are involved. Vertically, the problems go deeper: many of them move from the surface to underlying causes and even touch some very sensitive points.
The first problem we encountered was in fact the main reason for the sharp drop in the one-year indicator chart. The reason is simple: our denominator kept getting bigger.
After repeatedly checking the indicators and re-inventorying assets, we kept finding more terminal assets and other assets, such as network segments. At the same time, we found more and more blind spots not covered by our controls. In fact, we had been discovering such blind spots all along. For example, we were once demonstrating our terminal security operations to a customer in one of the company’s external demo conference rooms. We intended to show our control policy: a terminal without the terminal security software installed cannot connect to the internal network.
The demonstration turned into a slap in the face.
Our computer, with no terminal security software installed, connected straight to the internal network. It was very embarrassing. When we checked afterward, we found that this conference room had been assigned a special IP segment, and because we had assumed no office terminals would ever be on that segment, admission control had never been turned on for it.
The second desperate situation came when we started trying to raise the indicators. We sent out a large number of notification emails asking everyone in the company to cooperate with terminal hardening.
Regrettably, the enthusiastic, overwhelming response we had hoped for never came. Every notification email sank without a trace.
We tried to push things through management measures. For example, we required every department to designate a security officer, hoping to drive the work items within each department through that person. The security officers were indeed enthusiastic at first and cooperated well, but less than two weeks later they had stopped cooperating. There were many reasons. One was that our rules were not clear enough. Another, and the most important, was that we offered too little technical support: all we could provide was a list of items to be fixed, and we expected the security officers to get everything done inside their departments by management pressure alone. This failure made one thing clear to me:
Management measures alone have no value.
When it rains, it pours. Just as we were worn out dealing with all these problems, the company entered its hiring peak, onboarding 80 to 100 people a week. Some new colleagues brought their own BYOD devices, on which nothing related to terminal security software had been done at all: the computer often had no real-name binding, the terminal security software was stuck on an ancient version, and the security baseline was not met. The embarrassing result was that every Tuesday, onboarding day, became a nightmare for the terminal security project team. We might have managed to fix 100 problem terminals the week before, and then onboarding day would arrive and wipe out the whole week’s work.
Another very difficult issue was that as we rolled out terminal security operations at scale, we met large-scale questioning and pushback from employees across the company. Take the push for real-name authentication of terminals: for a while I did not dare to read my Lanxin messages, because as soon as I opened the app I was mercilessly bombarded in various groups. The complaints fell into a few consistent themes:
The first was to question the accuracy of the data. Some of our data was indeed flawed: the data sources did not cover everything, or the data was not updated in time. In other cases the data simply did not match what the user expected, which led to the classic question: where did your data come from, and is it wrong?
The second was that the rules were unclear and there was too much meaningless work, or at least work that users considered meaningless. For example, we treat real-name authentication as a per-terminal matter: if one person has several terminals, each of them needs to be authenticated. But we found, only after a long time, that many users understood real-name authentication as a per-person matter, so that authenticating on any one terminal would be enough. The last classic complaint was that when users had questions or technical problems, there was no smooth feedback channel, so all they could do was vent.
The last and most serious problem we faced was the lack of continuous follow-up on work items. Often, when we remembered a work item defined a week or two earlier, we found that it had drifted completely away from the intended result, or that nobody was following it up at all. This is a typical failure to meet the requirement of high operability.
Having fallen into the valley of despair, we had to find a way out. We attacked the problem from several directions: polishing and optimizing the product, digging out more of the product’s capabilities, and improving the effectiveness and stability of our technical measures. We also broke problems and scenarios down into parts. For the onboarding problem, for example, we sent a front-line colleague undercover: this one person went through onboarding four or five times, took apart every step of the process, and embedded terminal security work items into each of them. We also tried top-down and bottom-up management at the same time, both applying pressure through senior leadership and doing a great deal of one-on-one outreach.
3. The slope of enlightenment
After struggling repeatedly and running into wall after wall, we began to find ideas and processes that actually worked, and the indicators entered a phase of continuous improvement. You could say we were enlightened.
What did we do around this stage? First, we put policy first and completed the relevant policy requirements so that our operations had a formal basis to rely on. For example, the company officially issued the “Administrative Measures for the Use of Computers by Company Employees” and the “Administrative Measures for the Entry and Exit of Company Employees”.
Second, to make sure admission control was effective, we sorted out all of the company’s office network segments and checked every VPN server, covering every entrance to the internal network, so that we would not end up guarding the windows while forgetting to close the door.
At the same time, our indicator operations rely heavily on the real-name information of terminals, so we put a lot of effort into digging out and locating ownerless terminals. By correlating terminal security software data, VPN authentication logs, 802.1x authentication logs, switch authentication logs, and even the access logs of some systems, terminal EDR logs, terminal file names, and other data, we located a total of 1,500 ownerless terminals.
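The correlation step can be pictured as a simple vote: for a terminal with no owner on record, look at which account appears most often alongside its MAC address across the available logs. The sketch below assumes each log source has already been reduced to `(mac, account)` pairs; the function name and record layout are hypothetical, and the result is a candidate to confirm with the user, not a verdict.

```python
from collections import Counter


def guess_owner(mac, log_sources):
    """Pick the account most often seen with this MAC across all log sources.

    log_sources maps a source name to a list of (mac, account) records.
    Returns (account, supporting_sources), or (None, []) when nothing matches.
    """
    votes = Counter()
    seen_in = {}
    for source, records in log_sources.items():
        for record_mac, account in records:
            # Compare MACs case-insensitively and skip records with no account.
            if record_mac.lower() == mac.lower() and account:
                votes[account] += 1
                seen_in.setdefault(account, set()).add(source)
    if not votes:
        return None, []
    account, _ = votes.most_common(1)[0]
    return account, sorted(seen_in[account])
```

Returning the supporting sources alongside the account makes it easier to judge how much to trust a match: an account seen in both VPN and 802.1x logs is a stronger candidate than one seen in a single file name.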
To do good work, one must first sharpen one’s tools. We also raised a large number of issues and requirements with the terminal security product line: 184 issues and 102 feature requests. These two figures show how much effort we put into optimizing and improving the product.
However, now when we look back and look at these requirements, especially some customized requirements, we increasingly feel:
Customization requirements are often not optimal solutions, and can even be called a double-edged sword.
Many people probably think a highly customized product, built entirely to their own wishes, is a great thing. I used to think so too. But when we looked at customization more rationally, we found it has many problems. One is cost: the time spent developing and verifying each customized requirement, plus the maintenance cost that comes with every subsequent version update.
In addition, many customized requirements are really temporary requirements. The next version may replace them with better, more widely applicable functions, or the problem they solve may itself be an isolated problem of a particular period. In that case the effort we spent pushing for the customization was poorly spent. So we are now more cautious about customization than before. Customized requirements and functions cannot replace operational methods, which in many cases are more cost-effective and better targeted.
Finally, we solidified these proven operational ideas and processes into SOPs and established operating standards. This was the step that truly turned quantitative change into qualitative change. We have written 82 SOPs in total, and they are the most effective guarantee of the quality of the entire operation.
4. Continuously stable plateau
Finally, the timeline reaches the present. After solving the problems that surfaced and climbing out of one pit after another, what state have our terminal security operations reached? I think that on this so-called stable plateau we are doing three things well.
The first is that we review every indicator every day and confirm the reason and scenario behind any change of 0.1% in an indicator. Operating at this precision places high demands on the visibility of the indicator data. To support it, we maintain a summary table for indicator improvement that contains the indicators of every terminal in the company, with every operation on every terminal recorded. This table is extremely valuable and provides strong data support.
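A daily review of this kind could start from a comparison like the one sketched below. It reads "0.1%" as a change of 0.1 percentage points, which is an assumption, and the function and indicator names are illustrative.

```python
def changed_indicators(yesterday, today, threshold=0.1):
    """Return {name: delta} for indicators whose change reaches the threshold.

    Both arguments map indicator names to percentages. Deltas are rounded to
    two decimals before comparison to avoid floating-point noise.
    """
    flagged = {}
    for name, value in today.items():
        if name not in yesterday:
            continue  # new indicator, nothing to compare against
        delta = round(value - yesterday[name], 2)
        if abs(delta) >= threshold:
            flagged[name] = delta
    return flagged
```

Each flagged indicator is then a prompt to go to the per-terminal table and find which terminals moved, and why.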
The second is that we keep refining the SOPs. We rely on SOPs to work, but we do not follow them blindly. Whenever an SOP deviates from the actual situation, we correct it so that it stays closely matched to the company’s internal environment. On average, each SOP has been adjusted about 10 times. The terminal security operations team normally has four front-line staff. However, because this internal project serves as a proving ground and training ground, staff turnover is far higher than normal: in one year, 37 colleagues have rotated in and out of the project. Even so, the project has kept running smoothly and the indicators have kept rising.
This is the power of SOP, the power of standardized processes.
The last is daily and weekly reports. Traditionally, daily and weekly reports are written for people who are not doing the work, so that they can keep track of project progress. In my opinion, though, they have another important value: they give the people buried in day-to-day work a chance to step back and look at the whole project. In other words, they give everyone a bird’s-eye view from which to follow progress and spot problems and risks. In fact, many of our problems and ideas were discovered while collating data and writing the daily report.
II. When we talk about terminal security operations, what do we care about?
1. Scope division
Scope division comes down to one question: which terminals do we care about and want to manage? In practice, the first thing we do is calculate the denominator. In theory, all terminal security indicators share the same denominator, which is the set of terminals we care about and need to manage.
Since the start of the terminal security project, we have revised the denominator algorithm 17 times, with only one purpose: to calculate the denominator completely and accurately.
Across these 17 versions we tried many dimensions and data sources. Counting by person could not handle one person with several terminals. Counting by the IT asset ledger could not handle BYOD devices. We also tried voluntary reporting, in which employees declared the terminals they used, but this kind of management measure clearly could not be enforced, and there was no way to keep the information updated and synchronized. In the end we chose an approach that does not depend on human cooperation at all and uses only objective data. Since 802.1x and VPN are the only two entrances to our intranet, we generate the denominator from the authentication logs of these two entrances, with exceptions removed. We also introduced a time window: only data from the past 30 days is used, which avoids the unbounded growth caused by BYOD devices.
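The final version of the denominator described above can be sketched in a few lines. This is a minimal illustration, not the team’s actual algorithm: the event layout `(timestamp, mac, account)` and the function name are assumptions, and exceptions are matched by domain account or MAC address, as the article describes later.

```python
from datetime import datetime, timedelta


def build_denominator(auth_events, exception_accounts, exception_macs,
                      now, window_days=30):
    """Return the set of MAC addresses that make up the denominator.

    auth_events: iterable of (timestamp, mac, account) taken from 802.1x and
    VPN authentication logs (the field layout is an assumption).
    """
    cutoff = now - timedelta(days=window_days)
    excluded_macs = {m.lower() for m in exception_macs}
    denominator = set()
    for timestamp, mac, account in auth_events:
        if timestamp < cutoff:
            continue  # outside the time window, e.g. a BYOD device no longer used
        if account in exception_accounts or mac.lower() in excluded_macs:
            continue  # approved exception: exceptional person or terminal
        denominator.add(mac.lower())
    return denominator
```

Using a set keyed on the normalized MAC means a terminal that authenticates many times, or through both entrances, is still counted once.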
Exceptions come from many sources, such as special business requirements and temporary needs. Our exception handling has followed a very typical spiral: from complex to simple, then from simple to complex, and so on.
In the first round, from complex to simple, we reduced all kinds of complicated situations and scenarios to two categories that are easy to calculate and manage: exceptional people and exceptional terminals, identified by domain account and MAC address respectively.
In the second round, from simple to complex, we added a whitelist review mechanism. For example, some projects have special business requirements and cannot install the terminal security software: has the project ended by now? Some terminals were whitelisted temporarily because of a usability problem: has the problem been solved, and can the software be installed again? Without such a mechanism, the whitelist can only grow without limit.
In the third round, from complex to simple, we established a standard process for applying for, reviewing, and managing exceptions, and added some automated ways to apply. We are about to start the fourth round, from simple to complex: for terminals that have been granted an exception and do not install the terminal security software, we will apply alternative controls. This is the spiral of the whole exception mechanism.
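One way to keep a whitelist from growing without limit is to give every entry a date by which it must be re-justified. The sketch below assumes such a `review_by` field; the article does not say how reviews are scheduled, so the record layout and names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class WhitelistEntry:
    kind: str          # "account" (exceptional person) or "mac" (exceptional terminal)
    value: str
    reason: str
    review_by: date    # date by which the exception must be re-justified


def due_for_review(entries, today):
    """Return entries whose review date has passed, oldest first."""
    overdue = [e for e in entries if e.review_by <= today]
    return sorted(overdue, key=lambda e: e.review_by)
```

Each overdue entry maps back to the review questions in the text: has the project ended, has the usability problem been fixed, can the client be installed again?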
Another part of scope division is identifying high-value terminals, which is also part of fine-grained operations. For high-value terminals we do more than set different indicator targets. We also define VIP rules specifically for high-value terminals and high-value users. For example, an alert is generated when a VIP user’s account logs in on a new device, or when a VIP user’s account dials into the VPN from overseas. We have even used some highly customized rules. For example, in February we released a new major version and wanted certain key terminals to update as soon as they connected to the intranet, so as soon as those terminals reported a log we began monitoring the entire upgrade. From this perspective, high-value terminals are also a proving ground: these customized rules may later be turned into automated detections and extended to all terminals.
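The two VIP rules mentioned above, a login from a new device and a VPN connection from overseas, could be expressed roughly as follows. The event fields, the `home_country` parameter, and the idea that a country code is already attached to each event are all assumptions for the sake of the example.

```python
def vip_alerts(event, vip_accounts, known_devices, home_country):
    """Return a list of alert strings for one login event.

    event: dict with "account", "device_id", "channel" ("vpn" or "lan") and
    "country" (assumed to be filled in upstream, e.g. by IP geolocation).
    known_devices: dict mapping account -> set of device ids seen before.
    """
    alerts = []
    account = event["account"]
    if account not in vip_accounts:
        return alerts  # the rules apply to VIP users only
    if event["device_id"] not in known_devices.get(account, set()):
        alerts.append(f"VIP {account} logged in from new device {event['device_id']}")
    if event["channel"] == "vpn" and event["country"] != home_country:
        alerts.append(f"VIP {account} connected to VPN from {event['country']}")
    return alerts
```

Keeping each rule as a small, separate check makes it straightforward to later drop the VIP filter and apply a proven rule to all terminals.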
2. Effectiveness of control measures
The effectiveness of controls is actually the most easily overlooked problem. Almost every new colleague who joins the terminal security project asks me the same question: if admission control is turned on, why is installation coverage not 100%? I don’t know whether you have thought about this. With admission control on, is installation coverage 100%? If not, why not?
In fact, there are many reasons. Here are a few:
Is admission coverage complete? Are all IP segments covered, or are there blind spots like the special conference-room segment mentioned above? Are all VPN servers covered?
Is admission being bypassed? Some bypasses are intentional. For example, VPN admission often checks only the process name, so a fake process name gets past it easily. Others are unintentional, such as using an old VPN client that has no admission feature, or running a terminal security client that reports to a different console.
Is the data accurate? Distorted data also skews the installation rate. For example, in the VPN logs we use, one complete authentication for connecting to the internal network produces four log entries. We initially took the “Assigned Service List” entry as the sign that a terminal had connected. Later we found that this entry is also generated when a terminal tries to connect to the VPN without the terminal security software installed and is blocked by admission control. The denominator was therefore inflated, and installation coverage looked lower than it was.
This brings us to failure-point detection, a concept we have emphasized again and again. Junge often tells us:
Only with the awareness and practice of failure-point detection can you be said to have stepped through the door of security operations.
When our controls have problems, such as blind spots or bypasses, how long does it take us to notice? If we rely purely on passive perception, then by the time we notice a failure point a more serious security incident has often already occurred. We must maintain active, targeted detection. The simplest example: from terminals that do not have the terminal security software installed, we attempt to connect to the internal network at regular or irregular intervals, and as soon as a connection succeeds an alert is generated.
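The probe described above could be as simple as a TCP connection attempt from a machine that deliberately lacks the security client. The following is a minimal sketch under that assumption; the target list, function names, and how the result is turned into an alert are all hypothetical.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe_admission_control(targets, reach=can_reach):
    """Run from a terminal WITHOUT the security client installed.

    Any reachable intranet target means admission control failed to block
    this terminal. Returns the reachable (host, port) pairs; the expected
    result is an empty list.
    """
    return [(host, port) for host, port in targets if reach(host, port)]
```

Scheduling this from one probe terminal per network segment and per VPN server would cover the blind-spot cases described earlier; a non-empty result should raise an alert immediately.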
3. User experience issues
Another issue brought about by control measures is user experience, which is also a concern of almost all companies. We must first make clear that user experience is not an absolute concept. In other words, there is no absolute way to improve user experience. Any control method will inevitably affect the user experience, but we can still try to do something, such as:
We can use purely technical means to reduce user involvement, or even remove it entirely, so that the task completes without the user noticing and the experience question never arises. Real-name authentication is an example. We had long asked users to perform real-name authentication themselves, without much success. We eventually solved it by technical means: the VPN client automatically passes the domain account the user logged in with to an interface on the terminal security client, which fills in the information and completes real-name authentication automatically.
User experience is a relative concept. It is often measured against a line the users draw for themselves, so if we manage to lower that line, the effect is the same as improving the experience.
One idea we have pushed again and again is that there is no one-size-fits-all answer. For the performance problems users complain about, we insist on letting data speak: we give users a page where they can report a performance problem and attach data and screenshots of CPU, memory, and network usage. In practice, ten people may complain that performance is unbearable, but only one or two are actually willing to file a report. If a feedback channel is provided and the user declines to use it, I consider the so-called performance problem safe to ignore.
Finally, we work hard to keep feedback channels open. The various feedback groups and feedback mailboxes guarantee a response within 5 minutes during working hours, so that for issues that really affect usability, someone is there to help troubleshoot. I don’t think a good attitude can make up for the user experience, but it can certainly soften the impact.
4. Incremental response
Every work item in terminal security operations comes down to two things: clearing the existing stock of problems and handling the increment. The increment, especially a continuous one, is particularly difficult.
For most companies, the main increment comes from new hires. We see new hires as both a challenge and an opportunity. On one hand, new hires mean a batch of new terminals pouring in, which may have no client installed, no real-name binding, and no baseline compliance. On the other hand, onboarding is a golden window: the first week may be the week in which employees are most willing to follow instructions, so education and communication at this stage tend to be more effective. Through cooperation with HR, IT, and operations, we have therefore embedded terminal security into almost every step of onboarding:
On the morning of onboarding day, new employees receive a short orientation, in which they are told about our terminal security software.
After that, new employees go to the IT service desk to collect their office equipment. Behind the scenes, we have a cooperation mechanism with IT and operations to ensure that every installation image used for company computers includes the terminal security software and that the version is kept up to date. New employees also receive a printed terminal security guide with their computer, which describes in detail the terminal security software steps they need to complete.
When the employee returns to the department, turns on the computer, and connects to the internal network, the client is upgraded and protection takes effect as soon as the terminal joins the network. The first thing a new employee does at work is confirm that the email account has been activated. At that point the employee sees an email we have placed in the mailbox in advance, which contains all the content related to the terminal security software as well as the channels for reporting problems.
In addition, one week after joining, new employees attend a two-day induction training. Half a day of it is devoted to security, and we follow up by asking whether everyone has completed the items in the onboarding notice.
Another scenario with a continuous increment is high-risk software on terminals. This year we began operating on high-risk software and developed indicators such as the proportion of terminals with high-risk software. We classify software and software versions with publicly known high-risk vulnerabilities as high-risk software. This is obviously another category that keeps growing, so we maintain a list of high-risk software and have made a series of automation attempts: email scripts and Lanxin bots automatically notify the users of terminals with high-risk software and ask them to upgrade.
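The matching step behind such notifications might look like the sketch below. It assumes the high-risk list maps each software name to the first fixed version and that versions are dotted numbers; both are simplifications, and all names are illustrative. Sending the email or Lanxin message is left out.

```python
def parse_version(text):
    """Turn "7.4.2" into (7, 4, 2); non-numeric parts are treated as 0."""
    return tuple(int(p) if p.isdigit() else 0 for p in text.split("."))


def is_older(version, fixed):
    """Compare dotted versions after padding them to the same length."""
    a, b = parse_version(version), parse_version(fixed)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


def find_high_risk(inventory, high_risk_list):
    """Return {owner: [(software, installed_version, fixed_version), ...]}.

    inventory: iterable of (owner, software, version) rows from terminals.
    high_risk_list: dict mapping lowercase software name -> first fixed
    version; anything older is treated as high risk.
    """
    findings = {}
    for owner, software, version in inventory:
        fixed = high_risk_list.get(software.lower())
        if fixed and is_older(version, fixed):
            findings.setdefault(owner, []).append((software, version, fixed))
    return findings
```

Grouping findings by owner lets one notification list everything a user needs to upgrade, rather than sending one message per package.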
5. Version upgrade issues
Now that we have mentioned upgrades, let's talk about upgrading the terminal security software itself, which may be what terminal security staff fear most. Not daring to upgrade is probably the most common problem of all. We benchmarked against financial industry standards and established a standardized upgrade process: verification of the new version, a rollback plan, a company-wide notification, and then a grayscale (staged) upgrade. Let the data speak. In one year, the company's terminal security software console was upgraded 21 times, and individual files that require a service restart were replaced 63 times. This shows that, as long as the product's own documentation and the upgrade process are strictly standardized, upgrading is not so terrible.
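The grayscale step can be sketched as follows. This is only one common way to pick a stable subset of terminals for each wave, assuming each terminal has a unique identifier; the function names and the percentage-based waves are illustrative assumptions, not a description of the console's built-in mechanism.

```python
import hashlib


def rollout_bucket(terminal_id):
    """Map a terminal ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(terminal_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def in_current_wave(terminal_id, percent):
    """True if the terminal belongs to the first `percent` percent of the rollout."""
    return rollout_bucket(terminal_id) < percent


# Hypothetical terminal IDs; each wave widens the previous one.
terminals = [f"pc-{n:04d}" for n in range(1, 201)]
wave_10 = {t for t in terminals if in_current_wave(t, 10)}
wave_50 = {t for t in terminals if in_current_wave(t, 50)}
wave_100 = {t for t in terminals if in_current_wave(t, 100)}
```

Because the bucket depends only on the terminal ID, a terminal upgraded in an early wave stays in every later wave, which keeps a rollback scoped to a known set of machines.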
6. Communication and collaboration mechanism
When it comes to collaboration, it has to be said that in any company, security staff and operation and maintenance (O&M) staff have a love-hate relationship. In establishing communication and cooperation mechanisms with O&M and other departments, we mostly rely on events to drive progress. For example, in November last year a serious incident occurred because of poor communication between the two sides. The cause was actually very simple. One of the password strength items in the terminal security software's security check requires the password expiration time to be less than 6 months. Because of a business need, O&M temporarily changed the password expiration time in the domain controller group policy without informing us, which caused all domain terminals to fail the security check and be isolated from the network. Through the review of this incident, we and O&M established a mechanism for synchronizing domain controller group policy changes. In a sense, the incident was turned into an opportunity, which is typical of an event-driven approach.
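To make the failure mode concrete, the sketch below shows the kind of check that could be run before a group policy change is pushed. The 180-day threshold is an assumed reading of "6 months", treating 0 as "never expires" follows the usual Windows policy convention, and the function names are hypothetical.

```python
MAX_PASSWORD_AGE_LIMIT_DAYS = 180  # assumption: "6 months" taken as 180 days


def passes_password_age_check(max_age_days):
    """Security check: the password must expire, and in less than the limit.

    A value of 0 is treated as "never expires" and therefore fails.
    """
    return 0 < max_age_days < MAX_PASSWORD_AGE_LIMIT_DAYS


def review_policy_change(current_days, proposed_days):
    """Classify a proposed change to the domain password expiration time."""
    if passes_password_age_check(proposed_days):
        return "ok"
    if passes_password_age_check(current_days):
        # Compliant terminals would start failing the check and be isolated.
        return "would_isolate_terminals"
    return "already_failing"


decision = review_policy_change(current_days=90, proposed_days=365)
```

A result other than "ok" is the signal that the security team and O&M need to talk before the change goes out.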
7. Change user habits
The last common problem I want to talk about is reshaping user habits.
I have to say that everyone is an adult, and it is really difficult to reshape an adult's habits.
Here too we are trying the event-driven approach, for example with remote work, which is widespread at the moment. Until now, our remote work has always used BYOD + VPN, which carries some security risks. Some BYOD devices already have high-risk vulnerabilities, or even viruses, before they connect to the internal network. In addition, some sensitive data was stored on BYOD devices, which led to data leakage and loss. Taking the opportunity of the shift to remote work, we have been gradually moving all remote access to sensitive systems onto cloud desktops and gradually prohibiting VPN access to those systems.
In addition, we have always insisted on a point:
Presence is always good. Praise and complaints are both better than silence.
Consider this question: within the enterprise, how many people talk about the terminal security software, and how many talk about terminal security management and control measures? After more than a year of persistently building our presence, we found that the number of people talking about us has clearly increased. The most typical example: in the feedback groups for IT problems, whenever someone says their computer cannot get online or cannot access a certain system, some enthusiastic person is sure to jump in and ask whether they have not installed the terminal security software, or whether they failed its security check. Sometimes this answer even has a flavor of shifting the blame onto us. Still, we believe that only once presence exists can the other desired effects follow.
Third, notes on an offensive and defensive drill
In mid-January this year, we conducted an internal offensive and defensive exercise. We found many interesting and valuable points, not all of which are limited to terminal security in the narrow sense. We share them here as well.
1. Before the offensive and defensive drill
Before this offensive and defensive exercise, we made a lot of preparations. First, the coverage of the terminal security software ensures that, given an IP, a MAC address, a mid, or a host name, we can immediately locate the person and obtain all relevant information and terminal logs. This preparatory work was completed over the course of a year.
In March of last year we conducted a similar internal exercise. Back then we were still stretched when it came to finding people, and terminal EDR logs were collected only ad hoc for some key personnel. The contrast between the two exercises is obvious. In emergency response, finding someone now takes almost negligible time. With the large number of concurrent incidents in a drill, this saved us a lot of valuable time.
Another preparation that was accumulated over time rather than done at the last minute is security awareness training, especially training on phishing emails. In the previous drill the attack team tried phishing emails and succeeded easily. This time the attack team again tried to deliver samples by phishing email, and the email was forged very realistically: it fully imitated our internal notification emails and asked everyone to upgrade the VPN client, and its attachment was a compressed file that packaged a completely normal VPN installation package together with malicious samples. On the one hand, this time we detected the email immediately and carried out emergency handling. On the other hand, and even more commendably, the phishing email was quickly reported to the Network Security Department. In addition, during this exercise, human resources and other departments reported almost all suspicious emails.
Besides this accumulated work, one key preparation before the exercise concerned something we had often overlooked: the security of the security software itself. The powerful functions of the terminal security software, including remote desktop, file distribution, and task distribution, would be devastating in the hands of an attacker.
Before this exercise, we spent almost a month hardening the console to an unprecedented degree. We fully split the console deployment into three separate parts: the application server used by terminals, the management server used by the management backend, and the database server, with strict access restrictions between them. Across the whole console, terminals are allowed to reach only a few required ports on the application server. In addition, to eliminate risks such as malicious file distribution, we completely blocked the management server's ports at the cost of some availability, ensuring that terminals cannot obtain any files from the management server. Throughout the process, we also closely monitored console-related network traffic and backend audit logs.
2. During the offensive and defensive drill
In this exercise, besides phishing emails, the attack team mainly used physical penetration because time was relatively tight. Let's briefly reconstruct the attack team's path:
In the early hours of the first day of the exercise, the attack team raided the office area at night. They entered the office area through the underground garage, found computers that had been left at workstations without being shut down, planted sample files from a USB flash drive, and took control of a batch of terminals.
So from our perspective, how did we find out that these terminals had been compromised, and then uncover what the attack team had done?
First, detection on the traffic side and the terminal side flagged abnormal behavior from certain IPs, and we quickly located the terminals and their users. This was made possible by the installation coverage and real-name rate of the terminal security software. Then, after examining the samples obtained from the computers, we found that they had been transferred from a USB flash drive, so we suspected physical penetration, and the attack was confirmed by pulling surveillance video. Finally, by searching for terminals holding a sample with the same MD5 and terminals that had used a USB flash drive with the same serial number (SN), we identified the other compromised terminals. This is where EDR shows its value.
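The last step, pivoting from one known compromised terminal to the others, can be sketched as below. The event structure and field names are assumptions for illustration; real EDR logs will have their own schema and query interface.

```python
from typing import NamedTuple, Optional


class EdrEvent(NamedTuple):
    terminal: str
    file_md5: Optional[str]    # MD5 of a file seen on the terminal, if any
    usb_serial: Optional[str]  # serial number of a USB drive used, if any


def related_terminals(events, seed):
    """Find terminals sharing a sample MD5 or a USB serial with the seed terminal."""
    md5s = {e.file_md5 for e in events if e.terminal == seed and e.file_md5}
    serials = {e.usb_serial for e in events if e.terminal == seed and e.usb_serial}
    related = set()
    for e in events:
        if e.terminal == seed:
            continue
        if e.file_md5 in md5s or e.usb_serial in serials:
            related.add(e.terminal)
    return related


# Hypothetical events; the MD5 and serial values are placeholders.
events = [
    EdrEvent("pc-010", "md5-sample-1", "SN-ATTACK"),
    EdrEvent("pc-011", "md5-sample-1", None),
    EdrEvent("pc-012", None, "SN-ATTACK"),
    EdrEvent("pc-013", "md5-other", "SN-NORMAL"),
]
suspects = related_terminals(events, "pc-010")
```

This is a single pivot from one seed. Repeating it from each newly found terminal would widen the search, at the cost of more false positives to triage.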
3. After the offensive and defensive drill
We have repeatedly emphasized that the purpose of offensive and defensive exercises is to find problems and discover risk points, so the review afterwards is extremely valuable. In reviewing this exercise, we found three obvious terminal security issues, all of which had long been neglected:
The first problem is computers not being shut down after work. In fact, the company has always required employees to shut down after work and to restart office computers within 24 hours. However, there was no effective technical control behind this requirement, and the exercise proved that management measures alone cannot enforce it. In the future, the shutdown and restart of office computers will be controlled by technical means so that the policy is actually implemented.
The second problem is the lack of management and control of USB flash drives and other removable storage devices. We will improve in three areas: allowing only registered USB flash drives to be used on office computers, disabling USB autoplay, and detecting abnormal USB flash drive behavior, such as a newly inserted drive or the same drive being used on multiple terminals. In all three areas, besides a stable technical implementation, we also need to consider the change in user habits.
The last problem is the management and control of whitelisted terminals, which was mentioned earlier: exceptions tend to go from simple to complex. During this exercise, there was a case in which a whitelisted terminal was compromised. Because the terminal security software was not installed on it, we had no visibility at all. We therefore reorganized the whitelisting process and made installing the terminal log collection client, in place of the terminal security software, a prerequisite for whitelisting.
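The two abnormal behaviors named above can be sketched as a simple pass over insertion events. The event format, the set of known serial numbers, and the threshold are assumptions for illustration, not the detection logic of any specific product.

```python
from collections import defaultdict


def usb_alerts(insertions, known_serials, max_terminals=1):
    """Scan chronological (terminal, usb_serial) insertions and return alerts.

    - "new_device": a serial that is neither known nor seen earlier in the stream
    - "multi_terminal": a serial that has just been used on more than max_terminals
    """
    seen_on = defaultdict(set)
    alerts = []
    for terminal, serial in insertions:
        if serial not in known_serials and serial not in seen_on:
            alerts.append(("new_device", serial, terminal))
        before = len(seen_on[serial])
        seen_on[serial].add(terminal)
        # Alert once, at the moment the threshold is crossed.
        if before <= max_terminals < len(seen_on[serial]):
            alerts.append(("multi_terminal", serial, terminal))
    return alerts


# Hypothetical insertions; SN-A is a known drive, SN-B is not.
insertions = [
    ("pc-01", "SN-A"),
    ("pc-02", "SN-B"),
    ("pc-03", "SN-B"),
    ("pc-03", "SN-B"),
    ("pc-01", "SN-A"),
]
alerts = usb_alerts(insertions, known_serials={"SN-A"})
```

In practice the threshold and the known set would need tuning against normal usage, since shared drives in a team are not necessarily malicious.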
In addition to terminal security operations, we are continuing to practice security asset management operations, server operations, vulnerability operations, and data security operations, and we will look for opportunities to report on, share, and discuss them with you.